A Robust and High-Dimensional Clustering Algorithm Based on Feature Weight and Entropy

Since the Fuzzy C-Means algorithm is incapable of considering the influence of different features and exponential constraints on high-dimensional and complex data, a fuzzy clustering algorithm based on non-Euclidean distance combining feature weights and entropy weights is proposed. The proposed alg...

Bibliographic Details
Main Author: Du, Xinzhi
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10048533/
https://www.ncbi.nlm.nih.gov/pubmed/36981399
http://dx.doi.org/10.3390/e25030510
author Du, Xinzhi
collection PubMed
description Since the Fuzzy C-Means algorithm is incapable of considering the influence of different features and exponential constraints on high-dimensional and complex data, a fuzzy clustering algorithm based on non-Euclidean distance combining feature weights and entropy weights is proposed. The proposed algorithm builds on the Fuzzy C-Means soft clustering algorithm to handle high-dimensional and complex data. Its objective function is modified with two different entropy terms and a non-Euclidean distance measure. The distance formula makes it more efficient to extract the contribution of different features. The first entropy term helps to minimize the clusters’ dispersion and maximize the negative entropy to control the clustering process, which also promotes the association between the samples. The second entropy term controls the feature weights, since different features carry different weights in the clustering process. Experiments on real-world datasets indicate that the proposed algorithm gives better clustering results than the other algorithms compared. The experiments also demonstrate the algorithm’s robustness by analyzing parameter sensitivity and comparing distance formulas. In summary, the proposed algorithm improves classification performance on noisy and high-dimensional datasets, increases computational efficiency, performs well on real-world high-dimensional datasets, and encourages the development of robust, noise-resistant, high-dimensional fuzzy clustering algorithms.
format Online
Article
Text
id pubmed-10048533
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10048533 2023-03-29 A Robust and High-Dimensional Clustering Algorithm Based on Feature Weight and Entropy Du, Xinzhi Entropy (Basel) Article MDPI 2023-03-16 /pmc/articles/PMC10048533/ /pubmed/36981399 http://dx.doi.org/10.3390/e25030510 Text en © 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Robust and High-Dimensional Clustering Algorithm Based on Feature Weight and Entropy
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10048533/
https://www.ncbi.nlm.nih.gov/pubmed/36981399
http://dx.doi.org/10.3390/e25030510
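
The description above names Fuzzy C-Means as the baseline and mentions an entropy term that controls feature weights. As a rough illustration only, the sketch below implements the standard Fuzzy C-Means updates in plain Python, plus the generic closed form that an entropy penalty on weights produces (weights proportional to exp(-D_k / gamma)). It is not the article's algorithm: the record does not give the objective function or the non-Euclidean distance, so the function names, the fuzzifier `m`, the parameter `gamma`, and the toy data are all assumptions made for this example.

```python
import math


def sq_dist(x, v):
    """Squared Euclidean distance between two points."""
    return sum((xk - vk) ** 2 for xk, vk in zip(x, v))


def update_memberships(X, V, m=2.0, eps=1e-12):
    """Standard FCM membership update on squared distances d:
    u_ij = 1 / sum_k (d_ij / d_ik) ** (1 / (m - 1))."""
    U = []
    for x in X:
        d = [max(sq_dist(x, v), eps) for v in V]  # eps avoids division by zero
        U.append([1.0 / sum((dj / dk) ** (1.0 / (m - 1.0)) for dk in d)
                  for dj in d])
    return U


def update_centers(X, U, m=2.0):
    """Standard FCM center update: mean of the points weighted by u_ij ** m."""
    n, c, dim = len(X), len(U[0]), len(X[0])
    V = []
    for j in range(c):
        um = [U[i][j] ** m for i in range(n)]
        total = sum(um)
        V.append([sum(um[i] * X[i][k] for i in range(n)) / total
                  for k in range(dim)])
    return V


def fcm(X, V0, m=2.0, iters=100, tol=1e-8):
    """Alternate the two updates until the centers stop moving."""
    V = [list(v) for v in V0]
    for _ in range(iters):
        U = update_memberships(X, V, m)
        V_new = update_centers(X, U, m)
        shift = max(sq_dist(a, b) for a, b in zip(V, V_new))
        V = V_new
        if shift < tol:
            break
    return update_memberships(X, V, m), V


def entropy_feature_weights(dispersion, gamma=1.0):
    """Minimizer of sum_k w_k * D_k + gamma * sum_k w_k * ln(w_k)
    subject to sum_k w_k = 1, i.e. w_k proportional to exp(-D_k / gamma).
    Generic entropy-regularized weighting, not the article's formula."""
    lo = min(dispersion)  # shift by the minimum for numerical stability
    e = [math.exp(-(d - lo) / gamma) for d in dispersion]
    s = sum(e)
    return [x / s for x in e]


# Toy data (hypothetical): two well-separated groups in two dimensions.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
U, V = fcm(X, V0=[X[0], X[-1]])
weights = entropy_feature_weights([0.5, 2.0], gamma=1.0)
```

Each row of `U` holds one sample's memberships and sums to 1. In `entropy_feature_weights`, a feature with smaller within-cluster dispersion receives a larger weight, and a larger `gamma` pushes the weights toward uniform.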