
A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters

Fuzzy c-means (FCM) requires the number of clusters to be known in advance. To address this shortcoming, the paper proposes a new self-adaptive method for determining the optimal number of clusters. First, a density-based algorithm is put forward. The algorithm, according to the characteristics of t...


Bibliographic Details
Main authors: Ren, Min, Liu, Peiyu, Wang, Zhihao, Yi, Jing
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5153549/
https://www.ncbi.nlm.nih.gov/pubmed/28042291
http://dx.doi.org/10.1155/2016/2647389
_version_ 1782474714452590592
author Ren, Min
Liu, Peiyu
Wang, Zhihao
Yi, Jing
collection PubMed
description Fuzzy c-means (FCM) requires the number of clusters to be known in advance. To address this shortcoming, the paper proposes a new self-adaptive method for determining the optimal number of clusters. First, a density-based algorithm is put forward that uses the characteristics of the dataset to determine the maximum possible number of clusters automatically, instead of relying on the empirical rule [Formula: see text], and to obtain the optimal initial cluster centroids. This mitigates the limitation of FCM that randomly selected initial centroids can lead it to converge to a local minimum. Second, by introducing a penalty function, the paper proposes a new fuzzy clustering validity index based on fuzzy compactness and separation. The penalty prevents the index from decreasing monotonically toward zero as the number of clusters approaches the number of objects in the dataset, a behavior that would otherwise cost the index its robustness and its ability to identify the optimal number of clusters. Building on these two results, a self-adaptive FCM algorithm is put forward that estimates the optimal number of clusters through an iterative trial-and-error process. Finally, experiments on UCI, KDD Cup 1999, and synthetic datasets show that the method not only determines the optimal number of clusters effectively but also reduces the number of FCM iterations while yielding a stable clustering result.
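The iterative trial-and-error process mentioned in the description can be illustrated with a minimal sketch. It is not the authors' method: this record gives neither the density-based initialisation nor the penalised validity index, so the sketch assumes a simple evenly spaced initialisation and substitutes the classical Xie-Beni index (lower is better). The names `fcm`, `xie_beni`, `choose_cluster_count`, and the parameter `c_max` are hypothetical and chosen only for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def memberships(points, centroids, m):
    """Standard FCM membership update for fuzzifier m > 1."""
    rows = []
    for p in points:
        # Clamp distances so a point sitting on a centroid does not divide by zero.
        inv = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centroids]
        total = sum(inv)
        rows.append([w / total for w in inv])
    return rows

def fcm(points, c, m=2.0, max_iter=200, tol=1e-9):
    """Plain fuzzy c-means; returns (centroids, membership matrix)."""
    n, dim = len(points), len(points[0])
    # ASSUMPTION: evenly spaced data points as initial centroids,
    # standing in for the paper's density-based initialisation.
    step = n // c
    centroids = [list(points[k * step]) for k in range(c)]
    for _ in range(max_iter):
        u = memberships(points, centroids, m)
        updated = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            updated.append([sum(w[i] * points[i][j] for i in range(n)) / total
                            for j in range(dim)])
        shift = max(sq_dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids, memberships(points, centroids, m)

def xie_beni(points, centroids, u, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by n times the minimum
    squared centroid separation. ASSUMPTION: used here in place of the
    paper's penalised index, which this record does not define."""
    c = len(centroids)
    compactness = sum(u[i][k] ** m * sq_dist(p, centroids[k])
                      for i, p in enumerate(points) for k in range(c))
    separation = min(sq_dist(centroids[k], centroids[l])
                     for k in range(c) for l in range(k + 1, c))
    if separation == 0.0:
        return float("inf")  # coincident centroids: treat as the worst score
    return compactness / (len(points) * separation)

def choose_cluster_count(points, c_max, m=2.0):
    """Try every c in [2, c_max] and keep the one with the lowest index."""
    scores = {}
    for c in range(2, c_max + 1):
        centroids, u = fcm(points, c, m)
        scores[c] = xie_beni(points, centroids, u, m)
    best_c = min(scores, key=scores.get)
    return best_c, scores

# Three tight, well-separated groups of four points each (synthetic toy data).
offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
demo_points = [(cx + dx, cy + dy)
               for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
               for dx, dy in offsets]
best_c, scores = choose_cluster_count(demo_points, c_max=5)
print(best_c, scores)
```

Here `c_max` is supplied by hand, whereas the description states that the paper derives the upper bound from the data. The plain Xie-Beni index is one of the indices that tends toward zero as the number of clusters approaches the number of objects, which is the behavior the paper's penalty function is said to prevent.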
format Online
Article
Text
id pubmed-5153549
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-5153549 2017-01-01 A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters Ren, Min Liu, Peiyu Wang, Zhihao Yi, Jing Comput Intell Neurosci Research Article Hindawi Publishing Corporation 2016 2016-11-29 /pmc/articles/PMC5153549/ /pubmed/28042291 http://dx.doi.org/10.1155/2016/2647389 Text en Copyright © 2016 Min Ren et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters
topic Research Article