An Improved Density Peak Clustering Algorithm for Multi-Density Data

Density peak clustering is the latest classic density-based clustering algorithm, which can directly find the cluster center without iteration. The algorithm needs to determine a unique parameter, so the selection of parameters is particularly important. However, for multi-density data, when one par...

Bibliographic Details
Main authors:	Yin, Lifeng, Wang, Yingfeng, Chen, Huayue, Deng, Wu
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9695166/ https://www.ncbi.nlm.nih.gov/pubmed/36433414 http://dx.doi.org/10.3390/s22228814

_version_	1784837988144906240
author	Yin, Lifeng Wang, Yingfeng Chen, Huayue Deng, Wu
author_facet	Yin, Lifeng Wang, Yingfeng Chen, Huayue Deng, Wu
author_sort	Yin, Lifeng
collection	PubMed
description	Density peak clustering is the latest classic density-based clustering algorithm, which can directly find the cluster center without iteration. The algorithm needs to determine a unique parameter, so the selection of parameters is particularly important. However, for multi-density data, when one parameter cannot satisfy all data, clustering often cannot achieve good results. Moreover, the subjective selection of cluster centers through decision diagrams is often not very convincing, and there are also certain errors. In view of the above problems, in order to achieve better clustering of multi-density data, this paper improves the density peak clustering algorithm. Aiming at the selection of parameter d(c), the K-nearest neighbor idea is used to sort the neighbor distance of each data, draw a line graph of the K-nearest neighbor distance, and find the global bifurcation point to divide the data with different densities. Aiming at the selection of cluster centers, the local density and distance of each data point in each data division is found, a γ map is drawn, the average value of the γ height difference is calculated, and through two screenings the largest discontinuity point is found to automatically determine the cluster center and the number of cluster centers. The divided datasets are clustered by the DPC algorithm, and then the clustering results are perfected and integrated by using the cluster fusion rules. Finally, a variety of experiments are designed from various perspectives on various artificial simulated datasets and UCI real datasets, which demonstrate the superiority of the F-DPC algorithm in terms of clustering effect, clustering quality, and number of samples.
format	Online Article Text
id	pubmed-9695166
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9695166 2022-11-26 An Improved Density Peak Clustering Algorithm for Multi-Density Data Yin, Lifeng Wang, Yingfeng Chen, Huayue Deng, Wu Sensors (Basel) Article MDPI 2022-11-15 /pmc/articles/PMC9695166/ /pubmed/36433414 http://dx.doi.org/10.3390/s22228814 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Yin, Lifeng Wang, Yingfeng Chen, Huayue Deng, Wu An Improved Density Peak Clustering Algorithm for Multi-Density Data
title	An Improved Density Peak Clustering Algorithm for Multi-Density Data
title_sort	improved density peak clustering algorithm for multi-density data
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9695166/ https://www.ncbi.nlm.nih.gov/pubmed/36433414 http://dx.doi.org/10.3390/s22228814
work_keys_str_mv	AT yinlifeng animproveddensitypeakclusteringalgorithmformultidensitydata AT wangyingfeng animproveddensitypeakclusteringalgorithmformultidensitydata AT chenhuayue animproveddensitypeakclusteringalgorithmformultidensitydata AT dengwu animproveddensitypeakclusteringalgorithmformultidensitydata AT yinlifeng improveddensitypeakclusteringalgorithmformultidensitydata AT wangyingfeng improveddensitypeakclusteringalgorithmformultidensitydata AT chenhuayue improveddensitypeakclusteringalgorithmformultidensitydata AT dengwu improveddensitypeakclusteringalgorithmformultidensitydata
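
The description field mentions a local density, a distance, a γ value, and a sorted K-nearest-neighbor distance curve, but the record does not define them. The sketch below follows the standard density peak clustering formulation with a cutoff kernel, which is an assumption here: ρ counts the neighbors within d_c, δ is the distance to the nearest point of higher density, and γ = ρ·δ. The function names `dpc_quantities` and `sorted_knn_distance` are hypothetical, and the paper's bifurcation-point search, two-stage screening, and cluster fusion rules are not reproduced.

```python
import numpy as np


def dpc_quantities(X, d_c):
    """Return (rho, delta, gamma) for the rows of X using a cutoff kernel.

    Assumed textbook DPC definitions, not the paper's exact procedure.
    """
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density: number of other points closer than d_c.
    rho = (dist < d_c).sum(axis=1) - 1  # subtract 1 to exclude the point itself
    n = len(X)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        if higher.any():
            # Distance to the nearest point with strictly higher density.
            delta[i] = dist[i, higher].min()
        else:
            # No denser point exists (ties included): use the largest distance.
            delta[i] = dist[i].max()
    gamma = rho * delta
    return rho, delta, gamma


def sorted_knn_distance(X, k):
    """Return each point's distance to its k-th nearest neighbor, sorted ascending."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # After sorting a row, column 0 is the zero self-distance, so column k
    # holds the k-th nearest neighbor.
    kth = np.sort(dist, axis=1)[:, k]
    return np.sort(kth)


if __name__ == "__main__":
    # Two small, well-separated groups of points (made-up illustration data).
    X = np.array([[0, 0], [0, 1], [1, 0], [0.5, 0.5],
                  [10, 10], [10, 11], [11, 10], [10.5, 10.5]], dtype=float)
    rho, delta, gamma = dpc_quantities(X, d_c=0.8)
    print("rho:", rho)
    print("indices with largest gamma:", np.argsort(gamma)[-2:])
    print("sorted 1-NN distances:", sorted_knn_distance(X, k=1))
```

Points with a large γ are the usual candidates for cluster centers, and a jump in the sorted K-nearest-neighbor curve is one way to see that a dataset mixes regions of different density. The full pairwise distance matrix needs O(n²) memory, so this form only suits small datasets.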

Similar items