Integrating domain similarity to improve protein complexes identification in TAP-MS data

Bibliographic Details
Main Authors: Cai, Bingjing; Wang, Haiying; Zheng, Huiru; Wang, Hui
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3907791/
https://www.ncbi.nlm.nih.gov/pubmed/24565259
http://dx.doi.org/10.1186/1477-5956-11-S1-S2
author Cai, Bingjing
Wang, Haiying
Zheng, Huiru
Wang, Hui
collection PubMed
description BACKGROUND: Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast-two-hybrid (Y2H) and tandem affinity-purification/mass-spectrometry (TAP-MS) are characterised by a significant number of false positives and false negatives. In recent years there has been a growing trend to incorporate diverse domain knowledge to support large-scale analysis of PPI networks. METHODS: This paper presents a new algorithm that incorporates Gene Ontology (GO) based semantic similarities to detect protein complexes in PPI networks generated by TAP-MS. To take the co-complex relations in TAP-MS data into account, TAP-MS PPI networks are modelled as a bipartite graph in which bait proteins form one set of nodes and prey proteins the other. Similarities between pairs of bait proteins are computed by considering both topological features and GO-driven semantic similarities. Bait proteins are then grouped into clusters based on their pairwise similarities, producing a set of 'seed' clusters. An expansion process is then applied to each 'seed' cluster to recruit prey proteins that are significantly associated with the same set of bait proteins, yielding the complete protein complexes. RESULTS: The proposed algorithm has been applied to real TAP-MS PPI networks. Fifteen quality measures have been employed to evaluate the quality of the generated protein complexes. Experimental results show that the proposed algorithm greatly improves the accuracy of complex identification and outperforms several state-of-the-art clustering algorithms. Moreover, by incorporating semantic similarity, the proposed algorithm is more robust to noise in the networks.
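The pipeline outlined in METHODS (bipartite bait/prey model, blended bait similarity, seed clustering, prey recruitment) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so every concrete choice here is an assumption made for illustration. Jaccard overlap of prey sets stands in for the topological features, a precomputed lookup table `go_sim` stands in for the GO-driven semantic similarity, single-linkage grouping above a threshold stands in for the clustering step, and a simple fraction cut-off stands in for the significance test used to recruit preys. The names `alpha`, `threshold` and `min_fraction` are hypothetical parameters.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bait_similarity(preys, go_sim, b1, b2, alpha=0.5):
    """Blend prey-set overlap with a GO-based score (alpha is a hypothetical weight)."""
    topo = jaccard(preys[b1], preys[b2])
    sem = go_sim.get(frozenset((b1, b2)), 0.0)  # assumed precomputed, in [0, 1]
    return alpha * topo + (1 - alpha) * sem


def seed_clusters(preys, go_sim, threshold=0.5, alpha=0.5):
    """Group baits whose similarity reaches the threshold (single linkage via union-find)."""
    parent = {b: b for b in preys}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for b1, b2 in combinations(sorted(preys), 2):
        if bait_similarity(preys, go_sim, b1, b2, alpha) >= threshold:
            parent[find(b1)] = find(b2)

    groups = {}
    for b in preys:
        groups.setdefault(find(b), set()).add(b)
    return list(groups.values())


def expand(seed, preys, min_fraction=0.6):
    """Recruit preys pulled down by at least min_fraction of the seed's baits.

    The fraction cut-off is a stand-in for a proper significance test.
    """
    counts = {}
    for b in seed:
        for p in preys[b]:
            counts[p] = counts.get(p, 0) + 1
    recruited = {p for p, c in counts.items() if c / len(seed) >= min_fraction}
    return set(seed) | recruited


def detect_complexes(preys, go_sim, threshold=0.5, alpha=0.5, min_fraction=0.6):
    """Seed clusters of baits, each expanded with its associated preys."""
    seeds = seed_clusters(preys, go_sim, threshold, alpha)
    return [expand(seed, preys, min_fraction) for seed in seeds]


# Toy bipartite network: each bait maps to the set of preys it pulled down.
toy_preys = {
    "B1": {"P1", "P2", "P3"},
    "B2": {"P1", "P2", "P4"},
    "B3": {"P7", "P8"},
}
# Made-up GO similarity scores, keyed by unordered bait pair.
toy_go_sim = {
    frozenset(("B1", "B2")): 0.9,
    frozenset(("B1", "B3")): 0.1,
    frozenset(("B2", "B3")): 0.2,
}

for complex_ in detect_complexes(toy_preys, toy_go_sim):
    print(sorted(complex_))
```

In the toy data, B1 and B2 share two of four preys and have a high GO score, so they form one seed; only P1 and P2 are recruited because P3 and P4 are each seen with just one of the two baits. Raising `alpha` shifts weight toward topology, which is the knob one would vary to probe the robustness to noise that the abstract attributes to the semantic component.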
format Online
Article
Text
id pubmed-3907791
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3907791 2014-02-13 Integrating domain similarity to improve protein complexes identification in TAP-MS data. Cai, Bingjing; Wang, Haiying; Zheng, Huiru; Wang, Hui. Proteome Sci. Research. BioMed Central 2013-11-07 /pmc/articles/PMC3907791/ /pubmed/24565259 http://dx.doi.org/10.1186/1477-5956-11-S1-S2 Text en Copyright © 2013 Cai et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Integrating domain similarity to improve protein complexes identification in TAP-MS data
topic Research