
A generalized fuzzy clustering framework for incomplete data by integrating feature weighted and kernel learning

Missing data presents a challenge to clustering algorithms, as traditional methods tend to pad incomplete data first before clustering. To combine the two processes of padding and clustering and improve the clustering accuracy, a generalized fuzzy clustering framework is proposed based on optimal co...


Bibliographic Details
Main Authors: Yang, Ying; Chen, Haoyu; Wu, Haoshen
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Algorithms and Analysis of Algorithms
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10588703/
https://www.ncbi.nlm.nih.gov/pubmed/37869452
http://dx.doi.org/10.7717/peerj-cs.1600
author Yang, Ying
Chen, Haoyu
Wu, Haoshen
collection PubMed
description Missing data presents a challenge to clustering algorithms, as traditional methods tend to pad incomplete data first before clustering. To combine the two processes of padding and clustering and improve the clustering accuracy, a generalized fuzzy clustering framework is proposed based on optimal completion strategy (OCS) and nearest prototype strategy (NPS) with four improved algorithms developed. Feature weights are introduced to reduce outliers’ influence on the cluster centers, and kernel functions are used to solve the linear indistinguishability problem. The proposed algorithms are evaluated regarding correct clustering rate, iteration number, and external evaluation indexes with nine datasets from the UCI (University of California, Irvine) Machine Learning Repository. The results of the experiment indicate that the clustering accuracy of the feature weighted kernel fuzzy C-means algorithm with NPS (NPS-WKFCM) and feature weighted kernel fuzzy C-means algorithm with OCS (OCS-WKFCM) under varying missing rates is superior to that of seven conventional algorithms. Experiments demonstrate that the enhanced algorithm proposed for clustering incomplete data is superior.
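The abstract names the optimal completion strategy (OCS), in which missing entries are treated as extra unknowns and re-estimated during clustering rather than padded beforehand. The following is a minimal sketch of that idea on top of plain fuzzy C-means, under stated assumptions: it omits the feature weights and kernel functions of the proposed NPS-WKFCM and OCS-WKFCM algorithms, it assumes every feature has at least one observed value, and the function name `ocs_fcm`, its parameters, and the toy data are hypothetical, not taken from the article.

```python
import random


def ocs_fcm(data, n_clusters, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy C-means with an OCS-style update for missing entries.

    `data` is a list of rows; a missing entry is marked with None.
    Returns (centers, memberships, completed_data). Illustrative only.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    missing = [(i, j) for i in range(n) for j in range(d) if data[i][j] is None]

    # Assumption: start each missing entry at the mean of its feature's observed values.
    col_means = []
    for j in range(d):
        observed = [row[j] for row in data if row[j] is not None]
        col_means.append(sum(observed) / len(observed))
    x = [[col_means[j] if v is None else float(v) for j, v in enumerate(row)]
         for row in data]

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        raw = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(raw)
        u.append([r / total for r in raw])

    centers = []
    for _ in range(max_iter):
        # Centers: membership-weighted (u^m) means of the completed data.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * x[i][j] for i in range(n)) / total
                            for j in range(d)])

        # Memberships from squared Euclidean distances (clamped to avoid 0).
        new_u = []
        for i in range(n):
            dist = [max(sum((x[i][j] - c[j]) ** 2 for j in range(d)), 1e-12)
                    for c in centers]
            inv = [dk ** (-1.0 / (m - 1.0)) for dk in dist]
            total = sum(inv)
            new_u.append([v / total for v in inv])

        # OCS step: a missing entry becomes the u^m-weighted mean of the
        # corresponding center coordinates.
        for i, j in missing:
            w = [new_u[i][k] ** m for k in range(n_clusters)]
            x[i][j] = sum(w[k] * centers[k][j] for k in range(n_clusters)) / sum(w)

        change = max(abs(new_u[i][k] - u[i][k])
                     for i in range(n) for k in range(n_clusters))
        u = new_u
        if change < tol:
            break
    return centers, u, x


# Toy data (made up for illustration): two groups, one missing entry in each.
data = [[0.0, 0.1], [0.2, None], [0.1, 0.0],
        [5.0, 5.1], [None, 4.9], [5.2, 5.0]]
centers, memberships, completed = ocs_fcm(data, n_clusters=2)
```

The nearest prototype strategy (NPS) mentioned alongside OCS would differ only in the completion step: instead of a weighted mean over all centers, each missing entry would be copied from the center nearest to that sample.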
format Online
Article
Text
id pubmed-10588703
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-10588703 2023-10-21 A generalized fuzzy clustering framework for incomplete data by integrating feature weighted and kernel learning Yang, Ying Chen, Haoyu Wu, Haoshen PeerJ Comput Sci Algorithms and Analysis of Algorithms Missing data presents a challenge to clustering algorithms, as traditional methods tend to pad incomplete data first before clustering. To combine the two processes of padding and clustering and improve the clustering accuracy, a generalized fuzzy clustering framework is proposed based on optimal completion strategy (OCS) and nearest prototype strategy (NPS) with four improved algorithms developed. Feature weights are introduced to reduce outliers’ influence on the cluster centers, and kernel functions are used to solve the linear indistinguishability problem. The proposed algorithms are evaluated regarding correct clustering rate, iteration number, and external evaluation indexes with nine datasets from the UCI (University of California, Irvine) Machine Learning Repository. The results of the experiment indicate that the clustering accuracy of the feature weighted kernel fuzzy C-means algorithm with NPS (NPS-WKFCM) and feature weighted kernel fuzzy C-means algorithm with OCS (OCS-WKFCM) under varying missing rates is superior to that of seven conventional algorithms. Experiments demonstrate that the enhanced algorithm proposed for clustering incomplete data is superior. PeerJ Inc. 2023-10-05 /pmc/articles/PMC10588703/ /pubmed/37869452 http://dx.doi.org/10.7717/peerj-cs.1600 Text en ©2023 Yang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title A generalized fuzzy clustering framework for incomplete data by integrating feature weighted and kernel learning
topic Algorithms and Analysis of Algorithms
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10588703/
https://www.ncbi.nlm.nih.gov/pubmed/37869452
http://dx.doi.org/10.7717/peerj-cs.1600