Cargando…

Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs

BACKGROUND: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general...

Descripción completa

Detalles Bibliográficos
Autores principales:	Hartsperger, Mara L, Blöchl, Florian, Stümpflen, Volker, Theis, Fabian J
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	BioMed Central 2010
Materias:	Methodology Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3247861/ https://www.ncbi.nlm.nih.gov/pubmed/20961418 http://dx.doi.org/10.1186/1471-2105-11-522

_version_	1782220181585526784
author	Hartsperger, Mara L Blöchl, Florian Stümpflen, Volker Theis, Fabian J
author_facet	Hartsperger, Mara L Blöchl, Florian Stümpflen, Volker Theis, Fabian J
author_sort	Hartsperger, Mara L
collection	PubMed
description	BACKGROUND: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. RESULTS: Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. CONCLUSIONS: In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem.
format	Online Article Text
id	pubmed-3247861
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-32478612011-12-30 Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs Hartsperger, Mara L Blöchl, Florian Stümpflen, Volker Theis, Fabian J BMC Bioinformatics Methodology Article BACKGROUND: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. RESULTS: Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. CONCLUSIONS: In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem. BioMed Central 2010-10-20 /pmc/articles/PMC3247861/ /pubmed/20961418 http://dx.doi.org/10.1186/1471-2105-11-522 Text en Copyright ©2010 Hartsperger et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Methodology Article Hartsperger, Mara L Blöchl, Florian Stümpflen, Volker Theis, Fabian J Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
title	Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
title_full	Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
title_fullStr	Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
title_full_unstemmed	Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
title_short	Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
title_sort	structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3247861/ https://www.ncbi.nlm.nih.gov/pubmed/20961418 http://dx.doi.org/10.1186/1471-2105-11-522
work_keys_str_mv	AT hartspergermaral structuringheterogeneousbiologicalinformationusingfuzzyclusteringofkpartitegraphs AT blochlflorian structuringheterogeneousbiologicalinformationusingfuzzyclusteringofkpartitegraphs AT stumpflenvolker structuringheterogeneousbiologicalinformationusingfuzzyclusteringofkpartitegraphs AT theisfabianj structuringheterogeneousbiologicalinformationusingfuzzyclusteringofkpartitegraphs

Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs

Ejemplares similares