Identifying bias in network clustering quality metrics

We study potential biases of popular network clustering quality metrics, such as those based on the dichotomy between internal and external connectivity. We propose a method that uses both stochastic and preferential attachment block models construction to generate networks with preset community str...

Bibliographic Details
Main authors: Renedo-Mirambell, Martí; Arratia, Argimiro
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Algorithms and Analysis of Algorithms
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495975/
https://www.ncbi.nlm.nih.gov/pubmed/37705625
http://dx.doi.org/10.7717/peerj-cs.1523
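
The abstract describes generating networks with a preset community structure from block models and then scoring partitions with quality metrics such as modularity. The sketch below is one illustrative way to try that idea, not the authors' code: it samples a plain stochastic block model and compares the standard Newman-Girvan modularity of the planted partition with that of a coarser partition obtained by merging blocks in pairs. The function names `sbm` and `modularity`, the block sizes, and the probabilities `p_in` and `p_out` are all assumptions chosen for the example. The density ratio is left out because this record does not give its definition.

```python
import random
from itertools import combinations


def sbm(block_sizes, p_in, p_out, seed=0):
    """Sample an undirected stochastic block model graph (hypothetical helper).

    Nodes in the same block are linked with probability p_in,
    nodes in different blocks with probability p_out.
    Returns (edges, labels), where labels[i] is the block of node i.
    """
    rng = random.Random(seed)
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    edges = []
    for u, v in combinations(range(len(labels)), 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return edges, labels


def modularity(edges, labels):
    """Newman-Girvan modularity: sum over clusters of e_c/m - (d_c/(2m))^2,

    where e_c is the number of edges inside cluster c, d_c is the total
    degree of its nodes, and m is the number of edges in the graph.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}  # cluster -> number of internal edges
    degree = {}    # cluster -> sum of node degrees
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# Four planted blocks of 25 nodes with strong internal connectivity (assumed values).
edges, planted = sbm([25, 25, 25, 25], p_in=0.5, p_out=0.02, seed=1)
# Coarser partition: merge blocks 0+1 and 2+3 into two big clusters.
merged = [c // 2 for c in planted]

print("planted (4 clusters):", modularity(edges, planted))
print("merged  (2 clusters):", modularity(edges, merged))
```

Repeating the comparison while varying `p_in`, `p_out`, and the number of blocks is one way to see whether a given metric systematically prefers fewer, larger clusters, which is the kind of bias the abstract reports for most of the metrics studied.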
author Renedo-Mirambell, Martí
Arratia, Argimiro
collection PubMed
description We study potential biases of popular network clustering quality metrics, such as those based on the dichotomy between internal and external connectivity. We propose a method that uses both stochastic and preferential attachment block models construction to generate networks with preset community structures, and Poisson or scale-free degree distribution, to which quality metrics will be applied. These models also allow us to generate multi-level structures of varying strength, which will show if metrics favour partitions into a larger or smaller number of clusters. Additionally, we propose another quality metric, the density ratio. We observed that most of the studied metrics tend to favour partitions into a smaller number of big clusters, even when their relative internal and external connectivity are the same. The metrics found to be less biased are modularity and density ratio.
format Online
Article
Text
id pubmed-10495975
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-104959752023-09-13 Identifying bias in network clustering quality metrics Renedo-Mirambell, Martí Arratia, Argimiro PeerJ Comput Sci Algorithms and Analysis of Algorithms We study potential biases of popular network clustering quality metrics, such as those based on the dichotomy between internal and external connectivity. We propose a method that uses both stochastic and preferential attachment block models construction to generate networks with preset community structures, and Poisson or scale-free degree distribution, to which quality metrics will be applied. These models also allow us to generate multi-level structures of varying strength, which will show if metrics favour partitions into a larger or smaller number of clusters. Additionally, we propose another quality metric, the density ratio. We observed that most of the studied metrics tend to favour partitions into a smaller number of big clusters, even when their relative internal and external connectivity are the same. The metrics found to be less biased are modularity and density ratio. PeerJ Inc. 2023-08-17 /pmc/articles/PMC10495975/ /pubmed/37705625 http://dx.doi.org/10.7717/peerj-cs.1523 Text en © 2023 Renedo-Mirambell and Arratia https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Identifying bias in network clustering quality metrics
topic Algorithms and Analysis of Algorithms