
Combined Gaussian Mixture Model and Pathfinder Algorithm for Data Clustering

Data clustering is one of the most influential branches of machine learning and data analysis, and Gaussian Mixture Models (GMMs) are frequently adopted in data clustering due to their ease of implementation. However, there are certain limitations to this approach that need to be acknowledged. GMMs...


Bibliographic Details
Main authors: Huang, Huajuan; Liao, Zepeng; Wei, Xiuxi; Zhou, Yongquan
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10296861/
https://www.ncbi.nlm.nih.gov/pubmed/37372290
http://dx.doi.org/10.3390/e25060946
author Huang, Huajuan
Liao, Zepeng
Wei, Xiuxi
Zhou, Yongquan
collection PubMed
description Data clustering is one of the most influential branches of machine learning and data analysis, and Gaussian Mixture Models (GMMs) are frequently adopted for clustering because they are easy to implement. However, the approach has limitations: the number of clusters must be set manually, and initialization may fail to extract the information contained in the dataset. To address these issues, a new clustering algorithm called PFA-GMM is proposed, which combines GMMs with the Pathfinder algorithm (PFA). The algorithm first determines the optimal number of clusters automatically from the dataset. It then treats clustering as a global optimization problem to avoid getting trapped in local convergence during initialization. Finally, we conducted a comparative study of the proposed algorithm against other well-known clustering algorithms on both synthetic and real-world datasets. The experimental results indicate that PFA-GMM outperformed the competing approaches.
format Online
Article
Text
id pubmed-10296861
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10296861 2023-06-28 Combined Gaussian Mixture Model and Pathfinder Algorithm for Data Clustering Huang, Huajuan Liao, Zepeng Wei, Xiuxi Zhou, Yongquan Entropy (Basel) Article MDPI 2023-06-16 /pmc/articles/PMC10296861/ /pubmed/37372290 http://dx.doi.org/10.3390/e25060946 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Combined Gaussian Mixture Model and Pathfinder Algorithm for Data Clustering
topic Article