Cargando…

Hierarchical community detection via rank-2 symmetric nonnegative matrix factorization

BACKGROUND: Community discovery is an important task for revealing structures in large networks. The massive size of contemporary social networks poses a tremendous challenge to the scalability of traditional graph clustering algorithms and the evaluation of discovered communities. METHODS: We propo...

Descripción completa

Detalles Bibliográficos
Autores principales: Du, Rundong, Kuang, Da, Drake, Barry, Park, Haesun
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Springer International Publishing 2017
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5732610/
https://www.ncbi.nlm.nih.gov/pubmed/29266136
http://dx.doi.org/10.1186/s40649-017-0043-5
_version_ 1783286737572724736
author Du, Rundong
Kuang, Da
Drake, Barry
Park, Haesun
author_facet Du, Rundong
Kuang, Da
Drake, Barry
Park, Haesun
author_sort Du, Rundong
collection PubMed
description BACKGROUND: Community discovery is an important task for revealing structures in large networks. The massive size of contemporary social networks poses a tremendous challenge to the scalability of traditional graph clustering algorithms and the evaluation of discovered communities. METHODS: We propose a divide-and-conquer strategy to discover hierarchical community structure, nonoverlapping within each level. Our algorithm is based on the highly efficient rank-2 symmetric nonnegative matrix factorization. We solve several implementation challenges to boost its efficiency on modern computer architectures, specifically for very sparse adjacency matrices that represent a wide range of social networks. CONCLUSIONS: Empirical results have shown that our algorithm has competitive overall efficiency and leading performance in minimizing the average normalized cut, and that the nonoverlapping communities found by our algorithm recover the ground-truth communities better than state-of-the-art algorithms for overlapping community detection. In addition, we present a new dataset of the DBLP computer science bibliography network with richer meta-data and verifiable ground-truth knowledge, which can foster future research in community finding and interpretation of communities in large networks.
format Online
Article
Text
id pubmed-5732610
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-57326102017-12-18 Hierarchical community detection via rank-2 symmetric nonnegative matrix factorization Du, Rundong Kuang, Da Drake, Barry Park, Haesun Comput Soc Netw Research BACKGROUND: Community discovery is an important task for revealing structures in large networks. The massive size of contemporary social networks poses a tremendous challenge to the scalability of traditional graph clustering algorithms and the evaluation of discovered communities. METHODS: We propose a divide-and-conquer strategy to discover hierarchical community structure, nonoverlapping within each level. Our algorithm is based on the highly efficient rank-2 symmetric nonnegative matrix factorization. We solve several implementation challenges to boost its efficiency on modern computer architectures, specifically for very sparse adjacency matrices that represent a wide range of social networks. CONCLUSIONS: Empirical results have shown that our algorithm has competitive overall efficiency and leading performance in minimizing the average normalized cut, and that the nonoverlapping communities found by our algorithm recover the ground-truth communities better than state-of-the-art algorithms for overlapping community detection. In addition, we present a new dataset of the DBLP computer science bibliography network with richer meta-data and verifiable ground-truth knowledge, which can foster future research in community finding and interpretation of communities in large networks. Springer International Publishing 2017-09-08 2017 /pmc/articles/PMC5732610/ /pubmed/29266136 http://dx.doi.org/10.1186/s40649-017-0043-5 Text en © The Author(s) 2017 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle Research
Du, Rundong
Kuang, Da
Drake, Barry
Park, Haesun
Hierarchical community detection via rank-2 symmetric nonnegative matrix factorization
title Hierarchical community detection via rank-2 symmetric nonnegative matrix factorization
title_full Hierarchical community detection via rank-2 symmetric nonnegative matrix factorization
title_fullStr Hierarchical community detection via rank-2 symmetric nonnegative matrix factorization
title_full_unstemmed Hierarchical community detection via rank-2 symmetric nonnegative matrix factorization
title_short Hierarchical community detection via rank-2 symmetric nonnegative matrix factorization
title_sort hierarchical community detection via rank-2 symmetric nonnegative matrix factorization
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5732610/
https://www.ncbi.nlm.nih.gov/pubmed/29266136
http://dx.doi.org/10.1186/s40649-017-0043-5
work_keys_str_mv AT durundong hierarchicalcommunitydetectionviarank2symmetricnonnegativematrixfactorization
AT kuangda hierarchicalcommunitydetectionviarank2symmetricnonnegativematrixfactorization
AT drakebarry hierarchicalcommunitydetectionviarank2symmetricnonnegativematrixfactorization
AT parkhaesun hierarchicalcommunitydetectionviarank2symmetricnonnegativematrixfactorization