Cargando…

Protein complex detection in PPI networks based on data integration and supervised learning method

BACKGROUND: Revealing protein complexes are important for understanding principles of cellular organization and function. High-throughput experimental techniques have produced a large amount of protein interactions, which makes it possible to predict protein complexes from protein-protein interactio...

Descripción completa

Detalles Bibliográficos
Autores principales: Yu, Feng Ying, Yang, Zhi Hao, Hu, Xiao Hua, Sun, Yuan Yuan, Lin, Hong Fei, Wang, Jian
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2015
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4705505/
https://www.ncbi.nlm.nih.gov/pubmed/26329886
http://dx.doi.org/10.1186/1471-2105-16-S12-S3
Descripción
Sumario:BACKGROUND: Revealing protein complexes are important for understanding principles of cellular organization and function. High-throughput experimental techniques have produced a large amount of protein interactions, which makes it possible to predict protein complexes from protein-protein interaction (PPI) networks. However, the small amount of known physical interactions may limit protein complex detection. METHODS: The new PPI networks are constructed by integrating PPI datasets with the large and readily available PPI data from biomedical literature, and then the less reliable PPI between two proteins are filtered out based on semantic similarity and topological similarity of the two proteins. Finally, the supervised learning protein complex detection (SLPC), which can make full use of the information of available known complexes, is applied to detect protein complex on the new PPI networks. RESULTS: The experimental results of SLPC on two different categories yeast PPI networks demonstrate effectiveness of the approach: compared with the original PPI networks, the best average improvements of 4.76, 6.81 and 15.75 percentage units in the F-score, accuracy and maximum matching ratio (MMR) are achieved respectively; compared with the denoising PPI networks, the best average improvements of 3.91, 4.61 and 12.10 percentage units in the F-score, accuracy and MMR are achieved respectively; compared with ClusterONE, the start-of the-art complex detection method, on the denoising extended PPI networks, the average improvements of 26.02 and 22.40 percentage units in the F-score and MMR are achieved respectively. CONCLUSIONS: The experimental results show that the performances of SLPC have a large improvement through integration of new receivable PPI data from biomedical literature into original PPI networks and denoising PPI networks. In addition, our protein complexes detection method can achieve better performance than ClusterONE.