Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering
Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. Thes...
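Main authors: | Gaiteri, Chris; Chen, Mingming; Szymanski, Boleslaw; Kuzmin, Konstantin; Xie, Jierui; Lee, Changkyu; Blanche, Timothy; Chaibub Neto, Elias; Huang, Su-Chun; Grabowski, Thomas; Madhyastha, Tara; Komashko, Vitalina |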
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4637843/ https://www.ncbi.nlm.nih.gov/pubmed/26549511 http://dx.doi.org/10.1038/srep16361 |
_version_ | 1782399838862704640 |
---|---|
author | Gaiteri, Chris Chen, Mingming Szymanski, Boleslaw Kuzmin, Konstantin Xie, Jierui Lee, Changkyu Blanche, Timothy Chaibub Neto, Elias Huang, Su-Chun Grabowski, Thomas Madhyastha, Tara Komashko, Vitalina |
author_facet | Gaiteri, Chris Chen, Mingming Szymanski, Boleslaw Kuzmin, Konstantin Xie, Jierui Lee, Changkyu Blanche, Timothy Chaibub Neto, Elias Huang, Su-Chun Grabowski, Thomas Madhyastha, Tara Komashko, Vitalina |
author_sort | Gaiteri, Chris |
collection | PubMed |
description | Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. These detected communities may also be unstable and difficult to replicate, because traditional methods are sensitive to noise and parameter settings. These aspects of traditional clustering methods limit our ability to detect biological communities, and therefore our ability to understand biological functions. To address these limitations and detect robust overlapping biological communities, we propose an unorthodox clustering method called SpeakEasy which identifies communities using top-down and bottom-up approaches simultaneously. Specifically, nodes join communities based on their local connections, as well as global information about the network structure. This method can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes. SpeakEasy shows top performance on synthetic clustering benchmarks and accurately identifies meaningful biological communities in a range of datasets, including: gene microarrays, protein interactions, sorted cell populations, electrophysiology and fMRI brain imaging. |
format | Online Article Text |
id | pubmed-4637843 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-4637843 2015-11-30 Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering Gaiteri, Chris Chen, Mingming Szymanski, Boleslaw Kuzmin, Konstantin Xie, Jierui Lee, Changkyu Blanche, Timothy Chaibub Neto, Elias Huang, Su-Chun Grabowski, Thomas Madhyastha, Tara Komashko, Vitalina Sci Rep Article Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. These detected communities may also be unstable and difficult to replicate, because traditional methods are sensitive to noise and parameter settings. These aspects of traditional clustering methods limit our ability to detect biological communities, and therefore our ability to understand biological functions. To address these limitations and detect robust overlapping biological communities, we propose an unorthodox clustering method called SpeakEasy which identifies communities using top-down and bottom-up approaches simultaneously. Specifically, nodes join communities based on their local connections, as well as global information about the network structure. This method can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes. SpeakEasy shows top performance on synthetic clustering benchmarks and accurately identifies meaningful biological communities in a range of datasets, including: gene microarrays, protein interactions, sorted cell populations, electrophysiology and fMRI brain imaging. Nature Publishing Group 2015-11-09 /pmc/articles/PMC4637843/ /pubmed/26549511 http://dx.doi.org/10.1038/srep16361 Text en Copyright © 2015, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Gaiteri, Chris Chen, Mingming Szymanski, Boleslaw Kuzmin, Konstantin Xie, Jierui Lee, Changkyu Blanche, Timothy Chaibub Neto, Elias Huang, Su-Chun Grabowski, Thomas Madhyastha, Tara Komashko, Vitalina Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering |
title | Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering |
title_sort | identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4637843/ https://www.ncbi.nlm.nih.gov/pubmed/26549511 http://dx.doi.org/10.1038/srep16361 |
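
The description above says that nodes join communities based on their local connections as well as global information about the network structure. The record gives no further algorithmic detail, so the following is only a minimal sketch of one way such a rule could look, not the authors' SpeakEasy implementation. It assumes a label-propagation-style update in which each node adopts the neighbor label whose local count most exceeds the count expected from that label's global frequency. The function name `propagate_labels`, its parameters, and the scoring formula are all hypothetical, and the sketch runs a fixed number of sweeps with no convergence guarantee.

```python
import random
from collections import Counter


def propagate_labels(adj, sweeps=20, seed=0):
    """Illustrative local-plus-global label update (hypothetical, not SpeakEasy itself).

    adj maps each node to a list of its neighbors (undirected graph).
    Returns a dict mapping each node to a community label.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    n = len(nodes)
    # Bottom-up start: every node begins in its own community.
    labels = {v: v for v in nodes}
    # Top-down information: how common each label is across the whole network.
    global_freq = Counter(labels.values())

    for _ in range(sweeps):
        order = nodes[:]
        rng.shuffle(order)
        for v in order:
            neighbors = adj[v]
            if not neighbors:
                continue  # isolated nodes keep their own label
            local = Counter(labels[u] for u in neighbors)
            degree = len(neighbors)

            def excess(label):
                # Observed local count minus the count expected if labels
                # were spread according to their global frequency.
                return local[label] - degree * global_freq[label] / n

            best = max(sorted(local), key=excess)
            if best != labels[v]:
                global_freq[labels[v]] -= 1
                global_freq[best] += 1
                labels[v] = best
    return labels
```

Because a node can only adopt a label already held by one of its neighbors, labels never cross between disconnected parts of the network in this sketch. Features the description attributes to the method, such as overlapping membership and stability estimates, are not modeled here.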