On the statistical significance of communities from weighted graphs

Community detection is a fundamental procedure in the analysis of network data. Despite decades of research, there is still no consensus on the definition of a community. To analytically test the realness of a candidate community in weighted networks, we present a general formulation from a signific...

Bibliographic Details
Main authors: He, Zengyou; Chen, Wenfang; Wei, Xiaoqi; Liu, Yan
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8514603/
https://www.ncbi.nlm.nih.gov/pubmed/34645850
http://dx.doi.org/10.1038/s41598-021-99175-2
author He, Zengyou
Chen, Wenfang
Wei, Xiaoqi
Liu, Yan
collection PubMed
description Community detection is a fundamental procedure in the analysis of network data. Despite decades of research, there is still no consensus on the definition of a community. To analytically test the realness of a candidate community in weighted networks, we present a general formulation from a significance testing perspective. In this new formulation, the edge-weight is modeled as a censored observation due to the noisy characteristics of real networks. In particular, the edge-weights of missing links are incorporated as well, which are specified to be zeros based on the assumption that they are truncated or unobserved. Thereafter, the community significance assessment issue is formulated as a two-sample test problem on censored data. More precisely, the Logrank test is employed to conduct the significance testing on two sets of augmented edge-weights: internal weight set and external weight set. The presented approach is evaluated on both weighted networks and un-weighted networks. The experimental results show that our method can outperform prior widely used evaluation metrics on the task of individual community validation.
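The two-sample formulation in the description can be illustrated with a minimal sketch. It assumes a standard right-censored logrank test and a hypothetical encoding in which present edges are events at their weight and absent node pairs are zero-weight censored observations. The abstract does not spell out the encoding or the exact construction of the two sets, so the helper names `logrank_test` and `augmented_weight_sets`, the toy graph, and the censoring flags are illustrative assumptions, not the authors' implementation.

```python
import math
from itertools import combinations


def logrank_test(a, b):
    """Two-sample logrank test on right-censored data.

    a, b: lists of (value, is_event) pairs; is_event=False marks a censored
    observation. Returns (chi-square statistic with 1 df, p-value).
    """
    event_values = sorted({v for v, e in a + b if e})
    o_minus_e = 0.0  # observed minus expected events in sample a
    variance = 0.0   # hypergeometric variance accumulated over event values
    for t in event_values:
        n_a = sum(1 for v, _ in a if v >= t)  # at risk in a
        n_b = sum(1 for v, _ in b if v >= t)  # at risk in b
        d_a = sum(1 for v, e in a if e and v == t)
        d_b = sum(1 for v, e in b if e and v == t)
        n, d = n_a + n_b, d_a + d_b
        o_minus_e += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = o_minus_e ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def augmented_weight_sets(nodes, weights, community):
    """Build internal and external weight sets, including absent pairs.

    Assumption (not taken from the paper): an absent pair is recorded as
    weight 0.0 and flagged as censored; a present edge is an event.
    """
    internal, external = [], []
    for u, v in combinations(sorted(nodes), 2):
        inside = (u in community) + (v in community)
        if inside == 0:
            continue  # pair lies entirely outside the candidate community
        w = weights.get((u, v), weights.get((v, u), 0.0))
        obs = (w, w > 0)
        (internal if inside == 2 else external).append(obs)
    return internal, external


# Hypothetical toy graph: two triangles joined by one weak edge.
toy_nodes = range(6)
toy_weights = {(0, 1): 5.0, (0, 2): 4.0, (1, 2): 6.0, (2, 3): 1.0,
               (3, 4): 5.0, (4, 5): 4.0, (3, 5): 6.0}
internal, external = augmented_weight_sets(toy_nodes, toy_weights, {0, 1, 2})
chi2, p_value = logrank_test(internal, external)
print(f"chi2={chi2:.3f}, p={p_value:.4f}")
```

The statistic is two-sided, so a small p-value only indicates that the internal and external weight distributions differ; the sign of the observed-minus-expected term would be needed to tell which set carries the heavier weights.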
format Online
Article
Text
id pubmed-8514603
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8514603 2021-10-15 On the statistical significance of communities from weighted graphs He, Zengyou Chen, Wenfang Wei, Xiaoqi Liu, Yan Sci Rep Article Nature Publishing Group UK 2021-10-13 /pmc/articles/PMC8514603/ /pubmed/34645850 http://dx.doi.org/10.1038/s41598-021-99175-2 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title On the statistical significance of communities from weighted graphs
topic Article