Resampling Effects on Significance Analysis of Network Clustering and Ranking
Community detection helps us simplify the complex configuration of networks, but communities are reliable only if they are statistically significant. To detect statistically significant communities, a common approach is to resample the original network and analyze the communities. But resampling ass...
Main authors: | Mirshahvalad, Atieh; Beauchesne, Olivier H.; Archambault, Éric; Rosvall, Martin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3553110/ https://www.ncbi.nlm.nih.gov/pubmed/23372677 http://dx.doi.org/10.1371/journal.pone.0053943 |
_version_ | 1782256785894146048 |
---|---|
author | Mirshahvalad, Atieh Beauchesne, Olivier H. Archambault, Éric Rosvall, Martin |
author_facet | Mirshahvalad, Atieh Beauchesne, Olivier H. Archambault, Éric Rosvall, Martin |
author_sort | Mirshahvalad, Atieh |
collection | PubMed |
description | Community detection helps us simplify the complex configuration of networks, but communities are reliable only if they are statistically significant. To detect statistically significant communities, a common approach is to resample the original network and analyze the communities. But resampling assumes independence between samples, while the components of a network are inherently dependent. Therefore, we must understand how breaking dependencies between resampled components affects the results of the significance analysis. Here we use scientific communication as a model system to analyze this effect. Our dataset includes citations among articles published in journals in the years 1984–2010. We compare parametric resampling of citations with non-parametric article resampling. While citation resampling breaks link dependencies, article resampling maintains such dependencies. We find that citation resampling underestimates the variance of link weights. Moreover, this underestimation explains most of the differences in the significance analysis of ranking and clustering. Therefore, when only link weights are available and article resampling is not an option, we suggest a simple parametric resampling scheme that generates link-weight variances close to the link-weight variances of article resampling. Nevertheless, when we highlight and summarize important structural changes in science, the more dependencies we can maintain in the resampling scheme, the earlier we can predict structural change. |
format | Online Article Text |
id | pubmed-3553110 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-35531102013-01-31 Resampling Effects on Significance Analysis of Network Clustering and Ranking Mirshahvalad, Atieh Beauchesne, Olivier H. Archambault, Éric Rosvall, Martin PLoS One Research Article Community detection helps us simplify the complex configuration of networks, but communities are reliable only if they are statistically significant. To detect statistically significant communities, a common approach is to resample the original network and analyze the communities. But resampling assumes independence between samples, while the components of a network are inherently dependent. Therefore, we must understand how breaking dependencies between resampled components affects the results of the significance analysis. Here we use scientific communication as a model system to analyze this effect. Our dataset includes citations among articles published in journals in the years 1984–2010. We compare parametric resampling of citations with non-parametric article resampling. While citation resampling breaks link dependencies, article resampling maintains such dependencies. We find that citation resampling underestimates the variance of link weights. Moreover, this underestimation explains most of the differences in the significance analysis of ranking and clustering. Therefore, when only link weights are available and article resampling is not an option, we suggest a simple parametric resampling scheme that generates link-weight variances close to the link-weight variances of article resampling. Nevertheless, when we highlight and summarize important structural changes in science, the more dependencies we can maintain in the resampling scheme, the earlier we can predict structural change. 
Public Library of Science 2013-01-23 /pmc/articles/PMC3553110/ /pubmed/23372677 http://dx.doi.org/10.1371/journal.pone.0053943 Text en © 2013 Mirshahvalad et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Mirshahvalad, Atieh Beauchesne, Olivier H. Archambault, Éric Rosvall, Martin Resampling Effects on Significance Analysis of Network Clustering and Ranking |
title | Resampling Effects on Significance Analysis of Network Clustering and Ranking |
title_full | Resampling Effects on Significance Analysis of Network Clustering and Ranking |
title_fullStr | Resampling Effects on Significance Analysis of Network Clustering and Ranking |
title_full_unstemmed | Resampling Effects on Significance Analysis of Network Clustering and Ranking |
title_short | Resampling Effects on Significance Analysis of Network Clustering and Ranking |
title_sort | resampling effects on significance analysis of network clustering and ranking |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3553110/ https://www.ncbi.nlm.nih.gov/pubmed/23372677 http://dx.doi.org/10.1371/journal.pone.0053943 |
work_keys_str_mv | AT mirshahvaladatieh resamplingeffectsonsignificanceanalysisofnetworkclusteringandranking AT beauchesneolivierh resamplingeffectsonsignificanceanalysisofnetworkclusteringandranking AT archambaulteric resamplingeffectsonsignificanceanalysisofnetworkclusteringandranking AT rosvallmartin resamplingeffectsonsignificanceanalysisofnetworkclusteringandranking |
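
The contrast drawn in the description between citation resampling and article resampling can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, that parametric citation resampling redraws a link weight from a Poisson distribution whose mean is the observed weight (the record does not name the distribution), and that article resampling is an ordinary bootstrap over whole articles. The toy data and the names `poisson`, `article_resample`, `citation_resample`, and `link_weight_variance` are hypothetical.

```python
import math
import random
from statistics import pvariance


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def article_resample(rng, counts):
    """Bootstrap whole articles with replacement; citations from the
    same article stay together, so their dependence is preserved."""
    return sum(rng.choice(counts) for _ in counts)


def citation_resample(rng, counts):
    """Assumed parametric scheme: redraw the total link weight from a
    Poisson distribution, treating every citation as independent."""
    return poisson(rng, sum(counts))


def link_weight_variance(resample, counts, replicates=2000, seed=1):
    """Variance of one link weight across resampled networks."""
    rng = random.Random(seed)
    return pvariance([resample(rng, counts) for _ in range(replicates)])


# Hypothetical data: citations from each of 20 articles in journal A
# to journal B. Most articles cite B zero times, a few cite it heavily.
counts = [0] * 15 + [8] * 5

var_article = link_weight_variance(article_resample, counts)
var_citation = link_weight_variance(citation_resample, counts)
print(f"article resampling variance:  {var_article:.1f}")
print(f"citation resampling variance: {var_citation:.1f}")
```

For this toy link the Poisson variance should sit near the observed weight of 40, while the article bootstrap should sit near 240, because citations that arrive together in one article are resampled together. That gap is the kind of underestimation the description attributes to citation resampling, though the size of the effect here depends entirely on the made-up counts.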