A density-based clustering approach for identifying overlapping protein complexes with functional preferences

BACKGROUND: Identifying protein complexes is an essential task for understanding the mechanisms of proteins in cells. Many computational approaches have thus been developed to identify protein complexes in protein-protein interaction (PPI) networks. Regarding the information that can be adopted by c...

Bibliographic Details
Main Authors: Hu, Lun; Chan, Keith CC
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4445992/
https://www.ncbi.nlm.nih.gov/pubmed/26013799
http://dx.doi.org/10.1186/s12859-015-0583-3
_version_ 1782373355137007616
author Hu, Lun
Chan, Keith CC
author_facet Hu, Lun
Chan, Keith CC
author_sort Hu, Lun
collection PubMed
description BACKGROUND: Identifying protein complexes is an essential task for understanding the mechanisms of proteins in cells. Many computational approaches have thus been developed to identify protein complexes in protein-protein interaction (PPI) networks. Beyond the graph topology of the PPI network, such approaches have recently begun to draw on the functional information of proteins as well. They rely on the idea that proteins in the same protein complex may be associated with similar functional information. However, we note from our previous research that, for most protein complexes, the member proteins are similar only in specific subsets of the functional categories rather than in the entire set. Hence, if the preference of each functional category is also taken into account when identifying protein complexes, the accuracy will be improved. RESULTS: To implement this idea, we first introduce a preference vector for each protein that quantitatively indicates the preference of each functional category when deciding which protein complex the protein belongs to. Integrating the functional preferences of proteins with the graph topology of the PPI network, we formulate the identification of protein complexes as a constrained optimization problem and propose the approach DCAFP to address it. For performance evaluation, we conducted extensive experiments with several PPI networks from Saccharomyces cerevisiae and human, and compared DCAFP with state-of-the-art approaches to identifying protein complexes.
The experimental results show that integrating functional preferences with dense structures improved the identification of protein complexes: DCAFP outperformed the other approaches on most of the PPI networks as assessed by three independent measures, f-measure, Accuracy and Maximum Matching Rate. Furthermore, the function enrichment experiments indicated that DCAFP identified more functionally significant protein complexes than other approaches that also use functional information, such as PCIA. CONCLUSIONS: Given the promising performance of DCAFP, integrating functional preferences with dense structures makes it possible to identify protein complexes more accurately and with greater functional significance.
format Online
Article
Text
id pubmed-4445992
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4445992 2015-05-28 A density-based clustering approach for identifying overlapping protein complexes with functional preferences Hu, Lun Chan, Keith CC BMC Bioinformatics Methodology Article BioMed Central 2015-05-27 /pmc/articles/PMC4445992/ /pubmed/26013799 http://dx.doi.org/10.1186/s12859-015-0583-3 Text en © Hu and Chan; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A density-based clustering approach for identifying overlapping protein complexes with functional preferences
title_sort density-based clustering approach for identifying overlapping protein complexes with functional preferences
topic Methodology Article