Protein complex prediction for large protein protein interaction networks with the Core&Peel method

BACKGROUND: Biological networks play an increasingly important role in the exploration of functional modularity and cellular organization at a systemic level. Quite often the first tools used to analyze these networks are clustering algorithms. We concentrate here on the specific task of predicting...

Bibliographic Details
Main authors: Pellegrini, Marco; Baglioni, Miriam; Geraci, Filippo
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5123419/
https://www.ncbi.nlm.nih.gov/pubmed/28185552
http://dx.doi.org/10.1186/s12859-016-1191-6
_version_ 1782469733715542016
author Pellegrini, Marco
Baglioni, Miriam
Geraci, Filippo
author_facet Pellegrini, Marco
Baglioni, Miriam
Geraci, Filippo
author_sort Pellegrini, Marco
collection PubMed
description BACKGROUND: Biological networks play an increasingly important role in the exploration of functional modularity and cellular organization at a systemic level. Quite often the first tools used to analyze these networks are clustering algorithms. We concentrate here on the specific task of predicting protein complexes (PC) in large protein-protein interaction networks (PPIN). Currently, many state-of-the-art algorithms work well for networks of small or moderate size. However, their performance on much larger networks, which are becoming increasingly common in modern proteome-wise studies, needs to be re-assessed. RESULTS AND DISCUSSION: We present a new fast algorithm for clustering large sparse networks: Core&Peel, which runs essentially in time and storage O(a(G)m+n) for a network G of n nodes and m arcs, where a(G) is the arboricity of G (which is roughly proportional to the maximum average degree of any induced subgraph in G). We evaluated Core&Peel on five PPI networks of large size and one of medium size from both yeast and Homo sapiens, comparing its performance against those of ten state-of-the-art methods. We demonstrate that Core&Peel consistently outperforms the ten competitors in its ability to identify known protein complexes and in the functional coherence of its predictions. Our method is remarkably robust, being quite insensitive to the injection of random interactions. Core&Peel is also empirically efficient, attaining the second-best running time over large networks among the tested algorithms. CONCLUSIONS: Our algorithm Core&Peel pushes forward the state-of-the-art in PPIN clustering, providing an algorithmic solution with polynomial running time that attains experimentally demonstrable good output quality and speed on challenging large real networks. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1191-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5123419
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5123419 2016-12-08 Protein complex prediction for large protein protein interaction networks with the Core&Peel method Pellegrini, Marco Baglioni, Miriam Geraci, Filippo BMC Bioinformatics Research BACKGROUND: Biological networks play an increasingly important role in the exploration of functional modularity and cellular organization at a systemic level. Quite often the first tools used to analyze these networks are clustering algorithms. We concentrate here on the specific task of predicting protein complexes (PC) in large protein-protein interaction networks (PPIN). Currently, many state-of-the-art algorithms work well for networks of small or moderate size. However, their performance on much larger networks, which are becoming increasingly common in modern proteome-wise studies, needs to be re-assessed. RESULTS AND DISCUSSION: We present a new fast algorithm for clustering large sparse networks: Core&Peel, which runs essentially in time and storage O(a(G)m+n) for a network G of n nodes and m arcs, where a(G) is the arboricity of G (which is roughly proportional to the maximum average degree of any induced subgraph in G). We evaluated Core&Peel on five PPI networks of large size and one of medium size from both yeast and Homo sapiens, comparing its performance against those of ten state-of-the-art methods. We demonstrate that Core&Peel consistently outperforms the ten competitors in its ability to identify known protein complexes and in the functional coherence of its predictions. Our method is remarkably robust, being quite insensitive to the injection of random interactions. Core&Peel is also empirically efficient, attaining the second-best running time over large networks among the tested algorithms. CONCLUSIONS: Our algorithm Core&Peel pushes forward the state-of-the-art in PPIN clustering, providing an algorithmic solution with polynomial running time that attains experimentally demonstrable good output quality and speed on challenging large real networks. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1191-6) contains supplementary material, which is available to authorized users. BioMed Central 2016-11-08 /pmc/articles/PMC5123419/ /pubmed/28185552 http://dx.doi.org/10.1186/s12859-016-1191-6 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Pellegrini, Marco
Baglioni, Miriam
Geraci, Filippo
Protein complex prediction for large protein protein interaction networks with the Core&Peel method
title Protein complex prediction for large protein protein interaction networks with the Core&Peel method
title_full Protein complex prediction for large protein protein interaction networks with the Core&Peel method
title_fullStr Protein complex prediction for large protein protein interaction networks with the Core&Peel method
title_full_unstemmed Protein complex prediction for large protein protein interaction networks with the Core&Peel method
title_short Protein complex prediction for large protein protein interaction networks with the Core&Peel method
title_sort protein complex prediction for large protein protein interaction networks with the core&peel method
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5123419/
https://www.ncbi.nlm.nih.gov/pubmed/28185552
http://dx.doi.org/10.1186/s12859-016-1191-6
work_keys_str_mv AT pellegrinimarco proteincomplexpredictionforlargeproteinproteininteractionnetworkswiththecorepeelmethod
AT baglionimiriam proteincomplexpredictionforlargeproteinproteininteractionnetworkswiththecorepeelmethod
AT geracifilippo proteincomplexpredictionforlargeproteinproteininteractionnetworkswiththecorepeelmethod