Cargando…

A unified approach for cluster-wise and general noise rejection approaches for k-means clustering

Hard C-means (HCM; k-means) is one of the most widely used partitive clustering techniques. However, HCM is strongly affected by noise objects and cannot represent cluster overlap. To reduce the influence of noise objects, objects distant from cluster centers are rejected in some noise rejection app...

Descripción completa

Detalles Bibliográficos
Autor principal: Ubukata, Seiki
Formato: Online Artículo Texto
Lenguaje:English
Publicado: PeerJ Inc. 2019
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924505/
https://www.ncbi.nlm.nih.gov/pubmed/33816891
http://dx.doi.org/10.7717/peerj-cs.238
Descripción
Sumario:Hard C-means (HCM; k-means) is one of the most widely used partitive clustering techniques. However, HCM is strongly affected by noise objects and cannot represent cluster overlap. To reduce the influence of noise objects, objects distant from cluster centers are rejected in some noise rejection approaches including general noise rejection (GNR) and cluster-wise noise rejection (CNR). Generalized rough C-means (GRCM) can deal with positive, negative, and boundary belonging of object to clusters by reference to rough set theory. GRCM realizes cluster overlap by the linear function threshold-based object-cluster assignment. In this study, as a unified approach for GNR and CNR in HCM, we propose linear function threshold-based C-means (LiFTCM) by relaxing GRCM. We show that the linear function threshold-based assignment in LiFTCM includes GNR, CNR, and their combinations as well as rough assignment of GRCM. The classification boundary is visualized so that the characteristics of LiFTCM in various parameter settings are clarified. Numerical experiments demonstrate that the combinations of rough clustering or the combinations of GNR and CNR realized by LiFTCM yield satisfactory results.