A Novel Clustering Method Based on Adjacent Grids Searching
Clustering is used to analyze the intrinsic structure of a dataset based on the similarity of datapoints. Its widespread use, from image segmentation to object recognition and information retrieval, requires great robustness in the clustering process. In this paper, a novel clustering method based o...
Main authors: | Li, Zhimeng; Zhong, Wen; Liao, Weiwen; Zhao, Jian; Yu, Ming; He, Gaiyun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10528124/ https://www.ncbi.nlm.nih.gov/pubmed/37761640 http://dx.doi.org/10.3390/e25091342 |
_version_ | 1785111226853883904 |
---|---|
author | Li, Zhimeng; Zhong, Wen; Liao, Weiwen; Zhao, Jian; Yu, Ming; He, Gaiyun |
author_sort | Li, Zhimeng |
collection | PubMed |
description | Clustering analyzes the intrinsic structure of a dataset based on the similarity of its data points. Its widespread use, from image segmentation to object recognition and information retrieval, demands a robust clustering process. This paper proposes a novel clustering method based on adjacent grid searching (CAGS). CAGS consists of two steps: adaptive grid-space construction, followed by clustering through adjacent grid searching. In the first step, a multidimensional grid space is constructed to provide a quantized structure of the input dataset, and noise and cluster halos are automatically distinguished according to grid density. Moreover, the adaptive grid-generation process addresses a common problem of grid clustering, in which the number of cells increases sharply with the dimension. In the second step, a two-stage traversal process performs cluster recognition. Cluster cores of arbitrary shape can be found by concealing the halo points, which lets CAGS readily identify the number of clusters. CAGS therefore has the potential to be widely used for clustering datasets with different characteristics. Its clustering performance is tested on six types of datasets: datasets with noise, large-scale datasets, high-dimensional datasets, datasets with arbitrary shapes, datasets with large differences in density between classes, and datasets with high overlap between classes. Experimental results show that CAGS performed best on 10 of 11 tests, outperforming state-of-the-art clustering methods across all of these dataset types. |
format | Online Article Text |
id | pubmed-10528124 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10528124 2023-09-28 A Novel Clustering Method Based on Adjacent Grids Searching Li, Zhimeng Zhong, Wen Liao, Weiwen Zhao, Jian Yu, Ming He, Gaiyun Entropy (Basel) Article MDPI 2023-09-15 /pmc/articles/PMC10528124/ /pubmed/37761640 http://dx.doi.org/10.3390/e25091342 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | A Novel Clustering Method Based on Adjacent Grids Searching |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10528124/ https://www.ncbi.nlm.nih.gov/pubmed/37761640 http://dx.doi.org/10.3390/e25091342 |
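
The description field outlines a grid-based approach: quantize the data into cells, use cell density to separate noise from clusters, then recognize clusters by searching adjacent cells. The Python sketch below illustrates only that general idea and is not the authors' CAGS implementation. It assumes a fixed cell size and a single density threshold (`cell_size` and `min_points` are hypothetical parameters), does not build the adaptive grid space, does not treat halo points separately, and uses one breadth-first search in place of the two-stage traversal.

```python
from collections import Counter, deque
from itertools import product


def grid_cluster(points, cell_size, min_points):
    """Label points by linking adjacent dense grid cells; -1 marks noise.

    Simplified illustration: fixed (non-adaptive) grid, one density threshold.
    """
    if not points:
        return []
    # Quantize each point to the integer index of the cell containing it.
    cells = [tuple(int(c // cell_size) for c in p) for p in points]
    counts = Counter(cells)
    # A cell is "dense" if it holds at least min_points points.
    dense = {cell for cell, n in counts.items() if n >= min_points}
    dim = len(points[0])
    # Offsets to every adjacent cell (including diagonals), excluding the cell itself.
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    cell_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        # Breadth-first search over adjacent dense cells forms one cluster.
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for off in offsets:
                neighbor = tuple(c + d for c, d in zip(current, off))
                if neighbor in dense and neighbor not in cell_label:
                    cell_label[neighbor] = next_label
                    queue.append(neighbor)
        next_label += 1
    # Points in sparse cells are reported as noise (-1).
    return [cell_label.get(cell, -1) for cell in cells]


demo_points = [
    (0.1, 0.1), (0.2, 0.3), (0.9, 0.8),  # dense cell (0, 0)
    (1.1, 0.2),                          # sparse cell next to it
    (5.0, 5.0), (5.2, 5.1), (5.4, 5.3),  # dense cell (5, 5)
    (9.0, 0.0),                          # isolated point
]
print(grid_cluster(demo_points, cell_size=1.0, min_points=2))
# → [0, 0, 0, -1, 1, 1, 1, -1]
```

Note that the neighbor offsets in this sketch grow as 3^d - 1 with the dimension d, which is one face of the scaling problem the description says the adaptive grid construction is meant to address.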