
Fully automated protein complex prediction based on topological similarity and community structure

To understand the function of protein complexes and their association with biological processes, a lot of studies have been done towards analyzing the protein-protein interaction (PPI) networks. However, the advancement in high-throughput technology has resulted in a humongous amount of data for ana...


Bibliographic Details
Main authors: Lei, Chengwei, Tamim, Saleh, Bishop, Alexander JR, Ruan, Jianhua
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3908383/
https://www.ncbi.nlm.nih.gov/pubmed/24564887
http://dx.doi.org/10.1186/1477-5956-11-S1-S9
author Lei, Chengwei
Tamim, Saleh
Bishop, Alexander JR
Ruan, Jianhua
collection PubMed
description To understand the function of protein complexes and their association with biological processes, many studies have analyzed protein-protein interaction (PPI) networks. However, advances in high-throughput technology have produced a very large amount of data for analysis. Moreover, high levels of noise, sparseness, and skewness in the degree distribution of PPI networks limit the performance of many clustering algorithms and hinder further analysis of the interactions. To address these problems, we present a novel random-walk-based algorithm that converts the incomplete, binary PPI network into a protein-protein topological similarity matrix (PP-TS matrix). We hypothesize that two proteins sharing high-order topological similarities are likely to interact with each other. Using the PP-TS matrix, we constructed weighted networks to further study and analyze the interactions among proteins. Specifically, we applied a fully automated community-structure-finding algorithm (Auto-HQcut) to the weighted network to cluster proteins into complexes. We then analyzed the protein complexes for significance in biological processes. To help visualize and analyze these complexes, we also developed an interface that displays the resulting complexes and the characteristics associated with each. Applying our approach to a yeast protein-protein interaction network, we found that the predicted interaction pairs with high topological similarity have more significant biological relevance than the original protein-protein interaction pairs. When we compared our PPI network reconstruction algorithm with other existing algorithms using gene ontology and gene co-expression, our algorithm produced the highest similarity scores. Our predicted protein complexes also showed higher accuracy than other protein complex predictions.
format Online
Article
Text
id pubmed-3908383
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3908383 2014-02-13 Proteome Sci Research BioMed Central 2013-11-07 /pmc/articles/PMC3908383/ /pubmed/24564887 http://dx.doi.org/10.1186/1477-5956-11-S1-S9 Text en Copyright © 2013 Lei et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Fully automated protein complex prediction based on topological similarity and community structure
topic Research
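The description above mentions a random-walk-based conversion of a binary PPI network into a topological similarity matrix. The record gives no formulas, so the following is only a minimal sketch of one generic way such a conversion could look: a random walk with restart, followed by cosine similarity between the resulting visit profiles. The function names, the `restart` parameter, the choice of cosine similarity, and the toy network are all assumptions made for illustration; this is not the authors' PP-TS algorithm, and it does not cover Auto-HQcut.

```python
import numpy as np


def rwr_matrix(adj, restart=0.5):
    """Steady-state random-walk-with-restart matrix (hypothetical helper).

    Column j holds the stationary visit probabilities of a walker that
    jumps back to node j with probability `restart` at every step.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # avoid division by zero for isolated nodes
    w = adj / col_sums             # column-normalised transition matrix
    n = adj.shape[0]
    # Closed form of p = (1 - restart) * W p + restart * e_j for all j at once.
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * w)


def topological_similarity(adj, restart=0.5):
    """Cosine similarity between the visit profiles of every node pair.

    Using cosine similarity here is an assumption for illustration only.
    """
    profiles = rwr_matrix(adj, restart).T  # row i = profile of node i
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = profiles / norms
    return unit @ unit.T


if __name__ == "__main__":
    # Toy network (made up): two triangles joined by a single edge 2-3.
    toy = np.array([
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ])
    print(np.round(topological_similarity(toy), 3))
```

In a sketch like this, the dense similarity matrix could then be thresholded or used directly as edge weights before a community-detection step, which is the role the description assigns to the weighted network.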