On the statistical significance of communities from weighted graphs
Community detection is a fundamental procedure in the analysis of network data. Despite decades of research, there is still no consensus on the definition of a community. To analytically test the realness of a candidate community in weighted networks, we present a general formulation from a signific...
Main authors: | He, Zengyou; Chen, Wenfang; Wei, Xiaoqi; Liu, Yan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8514603/ https://www.ncbi.nlm.nih.gov/pubmed/34645850 http://dx.doi.org/10.1038/s41598-021-99175-2 |
_version_ | 1784583426974679040 |
---|---|
author | He, Zengyou Chen, Wenfang Wei, Xiaoqi Liu, Yan |
author_facet | He, Zengyou Chen, Wenfang Wei, Xiaoqi Liu, Yan |
author_sort | He, Zengyou |
collection | PubMed |
description | Community detection is a fundamental procedure in the analysis of network data. Despite decades of research, there is still no consensus on the definition of a community. To analytically test the realness of a candidate community in weighted networks, we present a general formulation from a significance testing perspective. In this new formulation, the edge-weight is modeled as a censored observation due to the noisy characteristics of real networks. In particular, the edge-weights of missing links are incorporated as well, which are specified to be zeros based on the assumption that they are truncated or unobserved. Thereafter, the community significance assessment issue is formulated as a two-sample test problem on censored data. More precisely, the Logrank test is employed to conduct the significance testing on two sets of augmented edge-weights: internal weight set and external weight set. The presented approach is evaluated on both weighted networks and un-weighted networks. The experimental results show that our method can outperform prior widely used evaluation metrics on the task of individual community validation. |
format | Online Article Text |
id | pubmed-8514603 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8514603 2021-10-15 On the statistical significance of communities from weighted graphs He, Zengyou Chen, Wenfang Wei, Xiaoqi Liu, Yan Sci Rep Article Community detection is a fundamental procedure in the analysis of network data. Despite decades of research, there is still no consensus on the definition of a community. To analytically test the realness of a candidate community in weighted networks, we present a general formulation from a significance testing perspective. In this new formulation, the edge-weight is modeled as a censored observation due to the noisy characteristics of real networks. In particular, the edge-weights of missing links are incorporated as well, which are specified to be zeros based on the assumption that they are truncated or unobserved. Thereafter, the community significance assessment issue is formulated as a two-sample test problem on censored data. More precisely, the Logrank test is employed to conduct the significance testing on two sets of augmented edge-weights: internal weight set and external weight set. The presented approach is evaluated on both weighted networks and un-weighted networks. The experimental results show that our method can outperform prior widely used evaluation metrics on the task of individual community validation. Nature Publishing Group UK 2021-10-13 /pmc/articles/PMC8514603/ /pubmed/34645850 http://dx.doi.org/10.1038/s41598-021-99175-2 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article He, Zengyou Chen, Wenfang Wei, Xiaoqi Liu, Yan On the statistical significance of communities from weighted graphs |
title | On the statistical significance of communities from weighted graphs |
title_sort | on the statistical significance of communities from weighted graphs |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8514603/ https://www.ncbi.nlm.nih.gov/pubmed/34645850 http://dx.doi.org/10.1038/s41598-021-99175-2 |
work_keys_str_mv | AT hezengyou onthestatisticalsignificanceofcommunitiesfromweightedgraphs AT chenwenfang onthestatisticalsignificanceofcommunitiesfromweightedgraphs AT weixiaoqi onthestatisticalsignificanceofcommunitiesfromweightedgraphs AT liuyan onthestatisticalsignificanceofcommunitiesfromweightedgraphs |
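
The description above outlines a two-sample Logrank test on internal and external edge-weight sets, with missing links entered as zero-weight censored observations. The following is a minimal sketch of that idea and not the authors' implementation. Several parts are assumptions made only for illustration: the function names (`logrank_test`, `augmented_weights`, `community_p_value`), the edge dictionary format, and in particular the mapping from weights to "times" (`w_max - w`, so that strong edges become early events and missing links become observations censored at the largest time). The record does not state how the paper performs this mapping.

```python
import math
from itertools import combinations


def logrank_test(times_a, events_a, times_b, events_b):
    """Standard two-sample log-rank test; returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Uncensored events occurring exactly at time t.
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of d_a
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    stat = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2.0))


def augmented_weights(edges, nodes, community):
    """Internal and external weight lists; missing links get weight 0."""
    w = {frozenset(k): v for k, v in edges.items()}
    inside = sorted(community)
    outside = [v for v in nodes if v not in community]
    internal = [w.get(frozenset(p), 0.0) for p in combinations(inside, 2)]
    external = [w.get(frozenset((u, v)), 0.0) for u in inside for v in outside]
    return internal, external


def community_p_value(edges, nodes, community):
    """Log-rank p-value for internal vs. external augmented weights."""
    internal, external = augmented_weights(edges, nodes, community)
    w_max = max(internal + external)

    def to_survival(weights):
        # ASSUMPTION of this sketch: time = w_max - weight, and a
        # zero weight (missing link) is treated as censored.
        return [w_max - x for x in weights], [x > 0 for x in weights]

    t_in, e_in = to_survival(internal)
    t_out, e_out = to_survival(external)
    return logrank_test(t_in, e_in, t_out, e_out)[1]


# Hypothetical toy graph: a heavy triangle {0, 1, 2} attached to a light path.
edges = {(0, 1): 5.0, (0, 2): 4.0, (1, 2): 5.0,
         (2, 3): 1.0, (3, 4): 3.0, (4, 5): 2.0}
p_value = community_p_value(edges, range(6), {0, 1, 2})
print(f"p-value for community {{0, 1, 2}}: {p_value:.2g}")
```

The test is two-sided, so a small p-value only says that the internal and external weight distributions differ; comparing the observed and expected event counts of the internal set would indicate the direction. For real analyses, an established survival-analysis library is likely a better choice than this hand-written routine.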