Cargando…

Community Partitioning over Feature-Rich Networks Using an Extended K-Means Method

This paper proposes a meaningful and effective extension of the celebrated K-means algorithm to detect communities in feature-rich networks, due to our assumption of non-summability mode. We least-squares approximate given matrices of inter-node links and feature values, leading to a straightforward...

Descripción completa

Detalles Bibliográficos
Autores principales:	Shalileh, Soroosh, Mirkin, Boris
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	MDPI 2022
Materias:	Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9142054/ https://www.ncbi.nlm.nih.gov/pubmed/35626512 http://dx.doi.org/10.3390/e24050626

_version_	1784715489783578624
author	Shalileh, Soroosh Mirkin, Boris
author_facet	Shalileh, Soroosh Mirkin, Boris
author_sort	Shalileh, Soroosh
collection	PubMed
description	This paper proposes a meaningful and effective extension of the celebrated K-means algorithm to detect communities in feature-rich networks, due to our assumption of non-summability mode. We least-squares approximate given matrices of inter-node links and feature values, leading to a straightforward extension of the conventional K-means clustering method as an alternating minimization strategy for the criterion. This works in a two-fold space, embracing both the network nodes and features. The metric used is a weighted sum of the squared Euclidean distances in the feature and network spaces. To tackle the so-called curse of dimensionality, we extend this to a version that uses the cosine distances between entities and centers. One more version of our method is based on the Manhattan distance metric. We conduct computational experiments to test our method and compare its performances with those by competing popular algorithms at synthetic and real-world datasets. The cosine-based version of the extended K-means typically wins at the high-dimension real-world datasets. In contrast, the Manhattan-based version wins at most synthetic datasets.
format	Online Article Text
id	pubmed-9142054
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-91420542022-05-28 Community Partitioning over Feature-Rich Networks Using an Extended K-Means Method Shalileh, Soroosh Mirkin, Boris Entropy (Basel) Article This paper proposes a meaningful and effective extension of the celebrated K-means algorithm to detect communities in feature-rich networks, due to our assumption of non-summability mode. We least-squares approximate given matrices of inter-node links and feature values, leading to a straightforward extension of the conventional K-means clustering method as an alternating minimization strategy for the criterion. This works in a two-fold space, embracing both the network nodes and features. The metric used is a weighted sum of the squared Euclidean distances in the feature and network spaces. To tackle the so-called curse of dimensionality, we extend this to a version that uses the cosine distances between entities and centers. One more version of our method is based on the Manhattan distance metric. We conduct computational experiments to test our method and compare its performances with those by competing popular algorithms at synthetic and real-world datasets. The cosine-based version of the extended K-means typically wins at the high-dimension real-world datasets. In contrast, the Manhattan-based version wins at most synthetic datasets. MDPI 2022-04-29 /pmc/articles/PMC9142054/ /pubmed/35626512 http://dx.doi.org/10.3390/e24050626 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Shalileh, Soroosh Mirkin, Boris Community Partitioning over Feature-Rich Networks Using an Extended K-Means Method
title	Community Partitioning over Feature-Rich Networks Using an Extended K-Means Method
title_full	Community Partitioning over Feature-Rich Networks Using an Extended K-Means Method
title_fullStr	Community Partitioning over Feature-Rich Networks Using an Extended K-Means Method
title_full_unstemmed	Community Partitioning over Feature-Rich Networks Using an Extended K-Means Method
title_short	Community Partitioning over Feature-Rich Networks Using an Extended K-Means Method
title_sort	community partitioning over feature-rich networks using an extended k-means method
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9142054/ https://www.ncbi.nlm.nih.gov/pubmed/35626512 http://dx.doi.org/10.3390/e24050626
work_keys_str_mv	AT shalilehsoroosh communitypartitioningoverfeaturerichnetworksusinganextendedkmeansmethod AT mirkinboris communitypartitioningoverfeaturerichnetworksusinganextendedkmeansmethod

Community Partitioning over Feature-Rich Networks Using an Extended K-Means Method

Ejemplares similares