
A new Kmeans clustering model and its generalization achieved by joint spectral embedding and rotation

The Kmeans clustering and spectral clustering are two popular clustering methods for grouping similar data points together according to their similarities. However, the performance of Kmeans clustering might be quite unstable due to the random initialization of the cluster centroids. Generally, spec...


Bibliographic Details
Main Authors: Huang, Wenna; Peng, Yong; Ge, Yuan; Kong, Wanzeng
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Algorithms and Analysis of Algorithms
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8022527/
https://www.ncbi.nlm.nih.gov/pubmed/33834111
http://dx.doi.org/10.7717/peerj-cs.450
author Huang, Wenna
Peng, Yong
Ge, Yuan
Kong, Wanzeng
collection PubMed
description Kmeans clustering and spectral clustering are two popular methods for grouping data points according to their similarities. However, the performance of Kmeans clustering can be quite unstable because the cluster centroids are randomly initialized. Spectral clustering methods generally employ a two-step strategy of spectral embedding followed by discretization postprocessing to obtain the cluster assignment, and the postprocessing step can easily deviate far from the true discrete solution. In this paper, based on the connection between Kmeans clustering and spectral clustering, we propose a new Kmeans formulation, termed KMSR, that jointly performs spectral embedding and spectral rotation, an effective postprocessing approach for discretization. Further, instead of directly using the dot-product data similarity measure, we generalize KMSR by incorporating more advanced data similarity measures and call the generalized model KMSR-G. An efficient optimization method is derived to solve the KMSR (KMSR-G) objective, and its complexity and convergence analyses are provided. We conduct experiments on extensive benchmark datasets to validate our proposed models, and the results demonstrate that they perform better than the related methods in most cases.
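The two-step strategy mentioned in the description (spectral embedding, then discretization by spectral rotation) can be illustrated with a minimal sketch. This is not the KMSR or KMSR-G model, whose joint objective is not given in this record; it is only the conventional two-step baseline, assuming a dot-product similarity matrix and NumPy. The function names, the row-based initialization of the rotation, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def spectral_embedding(X, k):
    """Return the top-k eigenvectors of the dot-product similarity X X^T."""
    S = X @ X.T
    # eigh returns eigenvalues in ascending order, so the last k columns
    # are the eigenvectors with the largest eigenvalues.
    _, vecs = np.linalg.eigh(S)
    return vecs[:, -k:]

def spectral_rotation(F, n_iter=50):
    """Alternate between a rotation R and a one-hot indicator Y
    to reduce ||F R - Y||_F (illustrative two-step discretization)."""
    n, k = F.shape
    # Assumed initialization: build R from k rows of F that overlap little
    # with each other, which does not depend on eigenvector signs.
    R = np.zeros((k, k))
    R[:, 0] = F[0]
    overlap = np.zeros(n)
    for j in range(1, k):
        overlap += np.abs(F @ R[:, j - 1])
        R[:, j] = F[np.argmin(overlap)]
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Indicator step: each point joins the column with the largest score.
        new_labels = np.argmax(F @ R, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        Y = np.eye(k)[labels]
        # Rotation step: orthogonal Procrustes solution via SVD of F^T Y.
        U, _, Vt = np.linalg.svd(F.T @ Y)
        R = U @ Vt
    return labels

# Toy data (hypothetical): two groups pointing along different axes.
X = np.array([[5.0, 0.1], [5.2, -0.1], [4.9, 0.0],
              [0.1, 3.0], [-0.1, 3.1], [0.0, 2.9]])
F = spectral_embedding(X, k=2)
labels = spectral_rotation(F)
print(labels)
```

Because the embedding is computed first and only then discretized, any error in the second step cannot feed back into the first; the description states that KMSR addresses this by performing both steps jointly.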
format Online
Article
Text
id pubmed-8022527
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8022527 2021-04-07 A new Kmeans clustering model and its generalization achieved by joint spectral embedding and rotation. Huang, Wenna; Peng, Yong; Ge, Yuan; Kong, Wanzeng. PeerJ Comput Sci. Algorithms and Analysis of Algorithms. PeerJ Inc. 2021-03-30 /pmc/articles/PMC8022527/ /pubmed/33834111 http://dx.doi.org/10.7717/peerj-cs.450 Text en ©2021 Huang et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title A new Kmeans clustering model and its generalization achieved by joint spectral embedding and rotation
topic Algorithms and Analysis of Algorithms
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8022527/
https://www.ncbi.nlm.nih.gov/pubmed/33834111
http://dx.doi.org/10.7717/peerj-cs.450