
Weighted Mutual Information for Aggregated Kernel Clustering


Bibliographic Details
Main Authors: Kachouie, Nezamoddin N., Shutaywi, Meshal
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516823/
https://www.ncbi.nlm.nih.gov/pubmed/33286125
http://dx.doi.org/10.3390/e22030351
_version_ 1783587087733227520
author Kachouie, Nezamoddin N.
Shutaywi, Meshal
author_facet Kachouie, Nezamoddin N.
Shutaywi, Meshal
author_sort Kachouie, Nezamoddin N.
collection PubMed
description Background: A common task in machine learning is clustering data into different groups based on similarities. Clustering methods can be divided into two groups: linear and nonlinear. A commonly used linear clustering method is K-means. Its extension, kernel K-means, is a nonlinear technique that utilizes a kernel function to project the data into a higher-dimensional space. The projected data are then clustered into different groups. Different kernels do not perform similarly when applied to different datasets. Methods: A kernel function might be well suited to one application but project data poorly in another, so choosing the right kernel for an arbitrary dataset is a challenging task. To address this challenge, a potential approach is to aggregate the clustering results to obtain an impartial clustering result regardless of the selected kernel function. The main challenge is then how to aggregate the clustering results. A potential solution is to combine them using a weight function. In this work, we introduce Weighted Mutual Information (WMI) for calculating the weights of different clustering methods based on their performance, in order to combine the results. The performance of each method is evaluated using a training set with known labels. Results: We applied the proposed Weighted Mutual Information to four datasets that cannot be linearly separated. We also tested the method under different noise conditions. Conclusions: Our results show that the proposed Weighted Mutual Information method is impartial, does not rely on a single kernel, and performs better than each individual kernel, especially under high noise.
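The Methods passage above describes weighting each kernel's clustering by its mutual information with known training labels and combining the results. The aggregation step can be sketched roughly as follows (a minimal illustration only: the kernel K-means runs themselves are omitted, and the cluster-to-class mapping and weight normalization used here are assumptions, not the authors' exact formulation):

```python
import numpy as np

def mutual_information(a, b):
    """Empirical mutual information (in nats) between two label vectors."""
    a, b = np.asarray(a), np.asarray(b)
    mi = 0.0
    for ca in np.unique(a):
        for cb in np.unique(b):
            p_ab = np.mean((a == ca) & (b == cb))   # joint probability
            if p_ab > 0:
                p_a = np.mean(a == ca)
                p_b = np.mean(b == cb)
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def wmi_aggregate(clusterings, train_idx, train_labels):
    """Combine several clusterings by MI-weighted voting.

    Each clustering is weighted by its mutual information with the known
    labels on the training subset; each cluster id is mapped to the
    majority true class among its training members.
    """
    train_labels = np.asarray(train_labels)
    classes = np.unique(train_labels)
    mis = np.array([mutual_information(c[train_idx], train_labels)
                    for c in clusterings])
    weights = mis / mis.sum()                 # normalize MI scores to weights
    votes = np.zeros((len(clusterings[0]), len(classes)))
    for w, c in zip(weights, clusterings):
        mapping = {}
        for cid in np.unique(c):
            members = train_labels[c[train_idx] == cid]
            counts = [np.sum(members == k) for k in classes]
            mapping[cid] = int(np.argmax(counts)) if len(members) else 0
        for i, cid in enumerate(c):
            votes[i, mapping[cid]] += w       # weighted vote per data point
    return classes[np.argmax(votes, axis=1)]

# Toy example: one informative clustering, one uninformative one.
true_labels = np.array([0, 0, 0, 1, 1, 1])
good = np.array([0, 0, 0, 1, 1, 1])   # agrees with the true grouping
bad = np.array([0, 1, 0, 1, 0, 1])    # carries no information about it
train_idx = np.array([0, 1, 3, 4])    # labeled training subset
agg = wmi_aggregate([good, bad], train_idx, true_labels[train_idx])
```

In this toy run the uninformative clustering earns a training-set mutual information of zero and therefore weight zero, so the aggregate follows the informative clustering; with real data, each kernel K-means result would contribute in proportion to its mutual information with the labeled training subset.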
format Online
Article
Text
id pubmed-7516823
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7516823 2020-11-09 Weighted Mutual Information for Aggregated Kernel Clustering Kachouie, Nezamoddin N. Shutaywi, Meshal Entropy (Basel) Article MDPI 2020-03-18 /pmc/articles/PMC7516823/ /pubmed/33286125 http://dx.doi.org/10.3390/e22030351 Text en © 2020 by the authors.
Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Kachouie, Nezamoddin N.
Shutaywi, Meshal
Weighted Mutual Information for Aggregated Kernel Clustering
title Weighted Mutual Information for Aggregated Kernel Clustering
title_full Weighted Mutual Information for Aggregated Kernel Clustering
title_fullStr Weighted Mutual Information for Aggregated Kernel Clustering
title_full_unstemmed Weighted Mutual Information for Aggregated Kernel Clustering
title_short Weighted Mutual Information for Aggregated Kernel Clustering
title_sort weighted mutual information for aggregated kernel clustering
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516823/
https://www.ncbi.nlm.nih.gov/pubmed/33286125
http://dx.doi.org/10.3390/e22030351
work_keys_str_mv AT kachouienezamoddinn weightedmutualinformationforaggregatedkernelclustering
AT shutaywimeshal weightedmutualinformationforaggregatedkernelclustering