Cargando…
Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
BACKGROUND: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2010
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3247861/ https://www.ncbi.nlm.nih.gov/pubmed/20961418 http://dx.doi.org/10.1186/1471-2105-11-522 |
_version_ | 1782220181585526784 |
---|---|
author | Hartsperger, Mara L Blöchl, Florian Stümpflen, Volker Theis, Fabian J |
author_facet | Hartsperger, Mara L Blöchl, Florian Stümpflen, Volker Theis, Fabian J |
author_sort | Hartsperger, Mara L |
collection | PubMed |
description | BACKGROUND: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. RESULTS: Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. CONCLUSIONS: In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem. |
format | Online Article Text |
id | pubmed-3247861 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-32478612011-12-30 Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs Hartsperger, Mara L Blöchl, Florian Stümpflen, Volker Theis, Fabian J BMC Bioinformatics Methodology Article BACKGROUND: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. RESULTS: Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. CONCLUSIONS: In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem. BioMed Central 2010-10-20 /pmc/articles/PMC3247861/ /pubmed/20961418 http://dx.doi.org/10.1186/1471-2105-11-522 Text en Copyright ©2010 Hartsperger et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Hartsperger, Mara L Blöchl, Florian Stümpflen, Volker Theis, Fabian J Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs |
title | Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs |
title_full | Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs |
title_fullStr | Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs |
title_full_unstemmed | Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs |
title_short | Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs |
title_sort | structuring heterogeneous biological information using fuzzy clustering of k-partite graphs |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3247861/ https://www.ncbi.nlm.nih.gov/pubmed/20961418 http://dx.doi.org/10.1186/1471-2105-11-522 |
work_keys_str_mv | AT hartspergermaral structuringheterogeneousbiologicalinformationusingfuzzyclusteringofkpartitegraphs AT blochlflorian structuringheterogeneousbiologicalinformationusingfuzzyclusteringofkpartitegraphs AT stumpflenvolker structuringheterogeneousbiologicalinformationusingfuzzyclusteringofkpartitegraphs AT theisfabianj structuringheterogeneousbiologicalinformationusingfuzzyclusteringofkpartitegraphs |