A Monte Carlo Evaluation of Weighted Community Detection Algorithms

Bibliographic Details
Main Authors: Gates, Kathleen M., Henry, Teague, Steinley, Doug, Fair, Damien A.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5102890/
https://www.ncbi.nlm.nih.gov/pubmed/27891087
http://dx.doi.org/10.3389/fninf.2016.00045
author Gates, Kathleen M.
Henry, Teague
Steinley, Doug
Fair, Damien A.
collection PubMed
description The past decade has been marked by a proliferation of community detection algorithms that aim to organize nodes (e.g., individuals, brain regions, variables) into modular structures that indicate subgroups, clusters, or communities. Motivated by the emergence of big data across many fields of inquiry, these methodological developments have primarily focused on detecting communities of nodes in very large matrices. However, it remains unknown whether the algorithms can reliably detect communities in smaller graphs (i.e., 1000 nodes and fewer), which are commonly used in brain research. More importantly, these algorithms have predominantly been tested only on binary or sparse count matrices, and it remains unclear how well they recover community structure in other types of matrices, such as the often-used cross-correlation matrices representing functional connectivity across predefined brain regions. Of the publicly available approaches for weighted graphs that can detect communities in graphs of at least 1000 nodes, prior research has demonstrated that Newman's spectral approach (i.e., Leading Eigenvalue), Walktrap, Fast Modularity, the Louvain method (i.e., multilevel community method), Label Propagation, and Infomap all recover communities exceptionally well in certain circumstances. The purpose of the present Monte Carlo simulation study is to test these methods across a large number of conditions, including varied graph sizes and types of matrix (sparse count, correlation, and reflected Euclidean distance), to identify which algorithm is optimal for specific types of data matrices. The results indicate that when the data are in the form of sparse count networks (such as those seen in diffusion tensor imaging), Label Propagation and Walktrap emerged as the most reliable methods for community detection. For dense, weighted networks such as correlation matrices capturing functional connectivity, Walktrap consistently outperformed the other approaches for recovering communities.
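To make the idea of "recovering" a known community structure concrete, the following is a minimal sketch, not the authors' code. It builds a toy weighted graph with two planted communities, runs a simplified, deterministic variant of weighted label propagation (published versions use random update order and random tie-breaking), and scores the result with the adjusted Rand index. The function names `toy_graph`, `label_propagation`, and `adjusted_rand_index` are hypothetical, and the adjusted Rand index is an assumed recovery measure; the abstract does not say which measure the study used.

```python
from collections import Counter
from math import comb


def toy_graph():
    """Two 3-node cliques (weight 5) joined by one weak edge (weight 1)."""
    n = 6
    w = [[0.0] * n for _ in range(n)]
    for block in ([0, 1, 2], [3, 4, 5]):
        for i in block:
            for j in block:
                if i < j:
                    w[i][j] = w[j][i] = 5.0
    w[2][3] = w[3][2] = 1.0  # weak between-community edge
    return w


def label_propagation(weights, max_sweeps=100):
    """Simplified weighted label propagation (assumption: fixed node order,
    ties broken by the smallest label, so the run is deterministic)."""
    n = len(weights)
    labels = list(range(n))  # every node starts in its own community
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            score = {}
            for j in range(n):
                if j != i and weights[i][j] > 0:
                    score[labels[j]] = score.get(labels[j], 0.0) + weights[i][j]
            if not score:
                continue  # isolated node keeps its label
            # Highest total edge weight wins; smallest label breaks ties.
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


def adjusted_rand_index(a, b):
    """Chance-corrected agreement between two partitions of the same nodes."""
    n = len(a)
    if n < 2:
        return 1.0
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_ij - expected) / (max_index - expected)


truth = [0, 0, 0, 1, 1, 1]
found = label_propagation(toy_graph())
print(found, adjusted_rand_index(truth, found))  # ARI is 1.0 on this toy graph
```

A Monte Carlo evaluation would repeat this over many randomly generated graphs per condition (graph size, matrix type) and summarize the recovery scores for each algorithm. A library implementation of the named algorithms would be used in place of this toy routine.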
format Online
Article
Text
id pubmed-5102890
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5102890 2016-11-25 A Monte Carlo Evaluation of Weighted Community Detection Algorithms Gates, Kathleen M. Henry, Teague Steinley, Doug Fair, Damien A. Front Neuroinform Neuroscience Frontiers Media S.A. 2016-11-10 /pmc/articles/PMC5102890/ /pubmed/27891087 http://dx.doi.org/10.3389/fninf.2016.00045 Text en
Copyright © 2016 Gates, Henry, Steinley and Fair. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Monte Carlo Evaluation of Weighted Community Detection Algorithms
topic Neuroscience