A unified approach for cluster-wise and general noise rejection approaches for k-means clustering
Hard C-means (HCM; k-means) is one of the most widely used partitive clustering techniques. However, HCM is strongly affected by noise objects and cannot represent cluster overlap. To reduce the influence of noise objects, objects distant from cluster centers are rejected in some noise rejection app...
Main author: | Ubukata, Seiki |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2019 |
Subjects: | Data Mining and Machine Learning |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924505/ https://www.ncbi.nlm.nih.gov/pubmed/33816891 http://dx.doi.org/10.7717/peerj-cs.238 |
_version_ | 1783659104958414848 |
---|---|
author | Ubukata, Seiki |
author_facet | Ubukata, Seiki |
author_sort | Ubukata, Seiki |
collection | PubMed |
description | Hard C-means (HCM; k-means) is one of the most widely used partitive clustering techniques. However, HCM is strongly affected by noise objects and cannot represent cluster overlap. To reduce the influence of noise objects, objects distant from cluster centers are rejected in some noise rejection approaches including general noise rejection (GNR) and cluster-wise noise rejection (CNR). Generalized rough C-means (GRCM) can deal with positive, negative, and boundary belonging of object to clusters by reference to rough set theory. GRCM realizes cluster overlap by the linear function threshold-based object-cluster assignment. In this study, as a unified approach for GNR and CNR in HCM, we propose linear function threshold-based C-means (LiFTCM) by relaxing GRCM. We show that the linear function threshold-based assignment in LiFTCM includes GNR, CNR, and their combinations as well as rough assignment of GRCM. The classification boundary is visualized so that the characteristics of LiFTCM in various parameter settings are clarified. Numerical experiments demonstrate that the combinations of rough clustering or the combinations of GNR and CNR realized by LiFTCM yield satisfactory results. |
format | Online Article Text |
id | pubmed-7924505 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-79245052021-04-02 A unified approach for cluster-wise and general noise rejection approaches for k-means clustering Ubukata, Seiki PeerJ Comput Sci Data Mining and Machine Learning Hard C-means (HCM; k-means) is one of the most widely used partitive clustering techniques. However, HCM is strongly affected by noise objects and cannot represent cluster overlap. To reduce the influence of noise objects, objects distant from cluster centers are rejected in some noise rejection approaches including general noise rejection (GNR) and cluster-wise noise rejection (CNR). Generalized rough C-means (GRCM) can deal with positive, negative, and boundary belonging of object to clusters by reference to rough set theory. GRCM realizes cluster overlap by the linear function threshold-based object-cluster assignment. In this study, as a unified approach for GNR and CNR in HCM, we propose linear function threshold-based C-means (LiFTCM) by relaxing GRCM. We show that the linear function threshold-based assignment in LiFTCM includes GNR, CNR, and their combinations as well as rough assignment of GRCM. The classification boundary is visualized so that the characteristics of LiFTCM in various parameter settings are clarified. Numerical experiments demonstrate that the combinations of rough clustering or the combinations of GNR and CNR realized by LiFTCM yield satisfactory results. PeerJ Inc. 2019-11-18 /pmc/articles/PMC7924505/ /pubmed/33816891 http://dx.doi.org/10.7717/peerj-cs.238 Text en ©2019 Ubukata https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. 
For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Data Mining and Machine Learning Ubukata, Seiki A unified approach for cluster-wise and general noise rejection approaches for k-means clustering |
title | A unified approach for cluster-wise and general noise rejection approaches for k-means clustering |
title_full | A unified approach for cluster-wise and general noise rejection approaches for k-means clustering |
title_fullStr | A unified approach for cluster-wise and general noise rejection approaches for k-means clustering |
title_full_unstemmed | A unified approach for cluster-wise and general noise rejection approaches for k-means clustering |
title_short | A unified approach for cluster-wise and general noise rejection approaches for k-means clustering |
title_sort | unified approach for cluster-wise and general noise rejection approaches for k-means clustering |
topic | Data Mining and Machine Learning |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924505/ https://www.ncbi.nlm.nih.gov/pubmed/33816891 http://dx.doi.org/10.7717/peerj-cs.238 |
work_keys_str_mv | AT ubukataseiki aunifiedapproachforclusterwiseandgeneralnoiserejectionapproachesforkmeansclustering AT ubukataseiki unifiedapproachforclusterwiseandgeneralnoiserejectionapproachesforkmeansclustering |
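
The description in this record says that noise rejection approaches for HCM reject objects that are distant from the cluster centers. As a rough illustration of that idea only, the following minimal Python sketch runs hard C-means and marks an object as noise when its nearest center is farther away than a fixed distance. The function name `hcm_with_noise_rejection`, the parameter `max_dist`, and the fixed-distance rule are assumptions made for this example; the record does not give the article's GNR, CNR, or LiFTCM formulas, and this sketch does not reproduce them.

```python
import math


def hcm_with_noise_rejection(points, centers, max_dist, n_iter=20):
    """Hard C-means with a simple, assumed rejection rule.

    An object whose nearest center is farther than max_dist is labelled
    None (noise) and is ignored when the centers are updated.
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest center, or None if even that one is too far.
        labels = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            nearest = min(range(len(centers)), key=dists.__getitem__)
            labels.append(nearest if dists[nearest] <= max_dist else None)

        # Update step: each center becomes the mean of its non-rejected members.
        new_centers = []
        for k, c in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members))
                )
            else:
                new_centers.append(c)  # empty cluster: keep its center unchanged

        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, labels


# Hypothetical toy data: two tight groups and one far-away object.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (50.0, 50.0),
]
centers, labels = hcm_with_noise_rejection(
    points, centers=[(0.0, 0.0), (10.0, 10.0)], max_dist=5.0
)
print(labels)
```

On this toy data the far-away object at (50, 50) is labelled `None`, so it does not pull either center toward it, which is the effect that the description attributes to noise rejection.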