
A Novel Clustering Method Based on Adjacent Grids Searching

Clustering is used to analyze the intrinsic structure of a dataset based on the similarity of datapoints. Its widespread use, from image segmentation to object recognition and information retrieval, requires great robustness in the clustering process. In this paper, a novel clustering method based o...


Bibliographic Details
Main authors: Li, Zhimeng; Zhong, Wen; Liao, Weiwen; Zhao, Jian; Yu, Ming; He, Gaiyun
Format: Online Article Text
Language: English
Published: MDPI 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10528124/
https://www.ncbi.nlm.nih.gov/pubmed/37761640
http://dx.doi.org/10.3390/e25091342
_version_ 1785111226853883904
author Li, Zhimeng
Zhong, Wen
Liao, Weiwen
Zhao, Jian
Yu, Ming
He, Gaiyun
author_sort Li, Zhimeng
collection PubMed
description Clustering is used to analyze the intrinsic structure of a dataset based on the similarity of data points. Its widespread use, from image segmentation to object recognition and information retrieval, demands a highly robust clustering process. In this paper, a novel clustering method based on adjacent grid searching (CAGS) is proposed. CAGS consists of two steps: adaptive grid-space construction, followed by clustering through adjacent grid searching. In the first step, a multidimensional grid space is constructed to provide a quantized structure of the input dataset. Noise and cluster halos are automatically distinguished according to grid density. Moreover, the adaptive grid-generation process addresses a common problem of grid clustering, in which the number of cells increases sharply with the dimension. In the second step, a two-stage traversal process performs cluster recognition. Cluster cores with arbitrary shapes can be found by concealing the halo points, so the number of clusters is easily identified by CAGS. CAGS therefore has the potential to be widely used for clustering datasets with different characteristics. The clustering performance of CAGS is tested on six types of datasets: datasets with noise, large-scale datasets, high-dimensional datasets, datasets with arbitrary shapes, datasets with large differences in density between classes, and datasets with high overlap between classes. Experimental results show that CAGS performed best on 10 out of 11 tests, outperforming state-of-the-art clustering methods across all six dataset types.
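The two steps named in the description (quantizing points into grid cells, then searching adjacent cells) can be illustrated with a minimal Python sketch. This is not the authors' CAGS implementation: it assumes a fixed, uniform cell size rather than the adaptive grid the abstract describes, replaces the two-stage traversal with a single breadth-first flood fill, and simply hides low-density cells as noise/halo. The names `grid_cluster`, `cell_size`, and `min_density` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque
from itertools import product


def grid_cluster(points, cell_size, min_density):
    """Label points by flood-filling adjacent dense grid cells.

    Returns (labels, n_clusters); label -1 marks points in sparse cells,
    which this sketch treats as noise/halo. Assumed simplification, not CAGS.
    """
    # Step 1: quantize each point into an integer grid cell.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(int(c // cell_size) for c in p)
        cells[key].append(idx)

    # Cells holding fewer than min_density points are hidden from the search.
    dense = {k for k, members in cells.items() if len(members) >= min_density}

    # All neighbor offsets in {-1, 0, 1}^dim except the zero offset.
    dim = len(points[0]) if points else 0
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    # Step 2: breadth-first search over adjacent dense cells.
    labels = [-1] * len(points)
    visited = set()
    cluster_id = 0
    for start in sorted(dense):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for idx in cells[cell]:
                labels[idx] = cluster_id
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cell, off))
                if nb in dense and nb not in visited:
                    visited.add(nb)
                    queue.append(nb)
        cluster_id += 1
    return labels, cluster_id
```

Note that the neighbor list in this sketch has 3^d - 1 entries for d dimensions, which is the kind of growth with dimension that the description says the adaptive grid is meant to address; the sketch makes no attempt to solve it.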
format Online
Article
Text
id pubmed-10528124
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10528124 2023-09-28 A Novel Clustering Method Based on Adjacent Grids Searching Li, Zhimeng Zhong, Wen Liao, Weiwen Zhao, Jian Yu, Ming He, Gaiyun Entropy (Basel) Article MDPI 2023-09-15 /pmc/articles/PMC10528124/ /pubmed/37761640 http://dx.doi.org/10.3390/e25091342 Text en © 2023 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Novel Clustering Method Based on Adjacent Grids Searching
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10528124/
https://www.ncbi.nlm.nih.gov/pubmed/37761640
http://dx.doi.org/10.3390/e25091342