Integrating domain similarity to improve protein complexes identification in TAP-MS data
BACKGROUND: Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast two-hybrid (Y2H) and tandem a...
Main authors: | Cai, Bingjing; Wang, Haiying; Zheng, Huiru; Wang, Hui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3907791/ https://www.ncbi.nlm.nih.gov/pubmed/24565259 http://dx.doi.org/10.1186/1477-5956-11-S1-S2 |
_version_ | 1782301653576187904 |
---|---|
author | Cai, Bingjing Wang, Haiying Zheng, Huiru Wang, Hui |
author_facet | Cai, Bingjing Wang, Haiying Zheng, Huiru Wang, Hui |
author_sort | Cai, Bingjing |
collection | PubMed |
description | BACKGROUND: Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast two-hybrid (Y2H) and tandem affinity-purification/mass-spectrometry (TAP-MS) are characterised by a significant number of false positives and false negatives. In recent years there has been a growing trend to incorporate diverse domain knowledge to support large-scale analysis of PPI networks. METHODS: This paper presents a new algorithm that incorporates Gene Ontology (GO) based semantic similarities to detect protein complexes from PPI networks generated by TAP-MS. By taking co-complex relations in TAP-MS data into account, TAP-MS PPI networks are modelled as a bipartite graph, in which bait proteins form one set of nodes and prey proteins form the other. Similarities between pairs of bait proteins are computed by considering both topological features and GO-driven semantic similarities. Bait proteins are then grouped into clusters based on their pairwise similarities to produce a set of 'seed' clusters. An expansion process is applied to each 'seed' cluster to recruit prey proteins that are significantly associated with the same set of bait proteins, yielding the complete protein complexes. RESULTS: The proposed algorithm has been applied to real TAP-MS PPI networks. Fifteen quality measures have been employed to evaluate the quality of the generated protein complexes. Experimental results show that the proposed algorithm has greatly improved the accuracy of identifying complexes and outperformed several state-of-the-art clustering algorithms. Moreover, by incorporating semantic similarity, the proposed algorithm is more robust to noise in the networks. |
format | Online Article Text |
id | pubmed-3907791 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3907791 2014-02-13 Integrating domain similarity to improve protein complexes identification in TAP-MS data Cai, Bingjing Wang, Haiying Zheng, Huiru Wang, Hui Proteome Sci Research BACKGROUND: Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast two-hybrid (Y2H) and tandem affinity-purification/mass-spectrometry (TAP-MS) are characterised by a significant number of false positives and false negatives. In recent years there has been a growing trend to incorporate diverse domain knowledge to support large-scale analysis of PPI networks. METHODS: This paper presents a new algorithm that incorporates Gene Ontology (GO) based semantic similarities to detect protein complexes from PPI networks generated by TAP-MS. By taking co-complex relations in TAP-MS data into account, TAP-MS PPI networks are modelled as a bipartite graph, in which bait proteins form one set of nodes and prey proteins form the other. Similarities between pairs of bait proteins are computed by considering both topological features and GO-driven semantic similarities. Bait proteins are then grouped into clusters based on their pairwise similarities to produce a set of 'seed' clusters. An expansion process is applied to each 'seed' cluster to recruit prey proteins that are significantly associated with the same set of bait proteins, yielding the complete protein complexes. RESULTS: The proposed algorithm has been applied to real TAP-MS PPI networks. Fifteen quality measures have been employed to evaluate the quality of the generated protein complexes. Experimental results show that the proposed algorithm has greatly improved the accuracy of identifying complexes and outperformed several state-of-the-art clustering algorithms. Moreover, by incorporating semantic similarity, the proposed algorithm is more robust to noise in the networks. BioMed Central 2013-11-07 /pmc/articles/PMC3907791/ /pubmed/24565259 http://dx.doi.org/10.1186/1477-5956-11-S1-S2 Text en Copyright © 2013 Cai et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Cai, Bingjing Wang, Haiying Zheng, Huiru Wang, Hui Integrating domain similarity to improve protein complexes identification in TAP-MS data |
title | Integrating domain similarity to improve protein complexes identification in TAP-MS data |
title_full | Integrating domain similarity to improve protein complexes identification in TAP-MS data |
title_fullStr | Integrating domain similarity to improve protein complexes identification in TAP-MS data |
title_full_unstemmed | Integrating domain similarity to improve protein complexes identification in TAP-MS data |
title_short | Integrating domain similarity to improve protein complexes identification in TAP-MS data |
title_sort | integrating domain similarity to improve protein complexes identification in tap-ms data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3907791/ https://www.ncbi.nlm.nih.gov/pubmed/24565259 http://dx.doi.org/10.1186/1477-5956-11-S1-S2 |
work_keys_str_mv | AT caibingjing integratingdomainsimilaritytoimproveproteincomplexesidentificationintapmsdata AT wanghaiying integratingdomainsimilaritytoimproveproteincomplexesidentificationintapmsdata AT zhenghuiru integratingdomainsimilaritytoimproveproteincomplexesidentificationintapmsdata AT wanghui integratingdomainsimilaritytoimproveproteincomplexesidentificationintapmsdata |
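
The METHODS summary in this record outlines a pipeline: a bait-prey bipartite graph, combined topological and semantic bait similarity, seed clustering of baits, and prey recruitment. The Python sketch below is only an illustration of that outline, not the authors' implementation. Every concrete choice in it is an assumption that the record does not specify: Jaccard overlap of prey sets as the topological feature, a precomputed lookup table in place of a real GO semantic similarity measure, a weighted sum with `alpha` to combine the two, connected components above a `threshold` as the clustering step, and a simple `min_fraction` rule standing in for the statistical significance test used during expansion. All function and parameter names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two prey sets (assumed topological feature)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bait_similarity(pulldowns, semantic, alpha=0.5):
    """Combine topological and semantic similarity for every bait pair.

    pulldowns: dict mapping each bait to the set of preys it pulled down
               (the bipartite graph).
    semantic:  dict mapping (bait, bait) pairs to a precomputed GO-based
               similarity in [0, 1]; missing pairs count as 0.0 (assumption).
    """
    sims = {}
    for u, v in combinations(sorted(pulldowns), 2):
        topo = jaccard(pulldowns[u], pulldowns[v])
        sem = semantic.get((u, v), semantic.get((v, u), 0.0))
        sims[(u, v)] = alpha * topo + (1 - alpha) * sem
    return sims


def seed_clusters(baits, sims, threshold=0.5):
    """Group baits into seeds: connected components of pairs >= threshold."""
    parent = {b: b for b in baits}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), s in sims.items():
        if s >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for b in baits:
        groups.setdefault(find(b), set()).add(b)
    return list(groups.values())


def expand(seed, pulldowns, min_fraction=0.5):
    """Recruit preys pulled down by at least min_fraction of the seed's baits."""
    counts = {}
    for bait in seed:
        for prey in pulldowns[bait]:
            counts[prey] = counts.get(prey, 0) + 1
    recruited = {p for p, c in counts.items() if c / len(seed) >= min_fraction}
    return set(seed) | recruited


def detect_complexes(pulldowns, semantic, alpha=0.5, threshold=0.5,
                     min_fraction=0.5):
    """Run the whole sketch: similarity, seed clustering, expansion."""
    sims = bait_similarity(pulldowns, semantic, alpha)
    seeds = seed_clusters(pulldowns, sims, threshold)
    return [expand(seed, pulldowns, min_fraction) for seed in seeds]


# Toy data, invented purely for illustration.
toy_pulldowns = {"A": {"p1", "p2", "p3"}, "B": {"p1", "p2"}, "C": {"p9"}}
toy_semantic = {("A", "B"): 0.9, ("A", "C"): 0.1, ("B", "C"): 0.1}
print(detect_complexes(toy_pulldowns, toy_semantic))
```

On the toy data, baits A and B share two of three preys and have a high semantic score, so they form one seed, while C stays on its own. Raising `min_fraction` makes the expansion step stricter, which is one simple way to trade recall for precision on noisy pull-down data.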