Integrating domain similarity to improve protein complexes identification in TAP-MS data

BACKGROUND: Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamic of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast-two-hybrid (Y2H) and tandem a...

Bibliographic Details
Main Authors:	Cai, Bingjing; Wang, Haiying; Zheng, Huiru; Wang, Hui
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3907791/ https://www.ncbi.nlm.nih.gov/pubmed/24565259 http://dx.doi.org/10.1186/1477-5956-11-S1-S2

author	Cai, Bingjing Wang, Haiying Zheng, Huiru Wang, Hui
author_facet	Cai, Bingjing Wang, Haiying Zheng, Huiru Wang, Hui
author_sort	Cai, Bingjing
collection	PubMed
description	BACKGROUND: Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast-two-hybrid (Y2H) and tandem affinity-purification/mass-spectrometry (TAP-MS) are characterised by the presence of a significant number of false positives and false negatives. In recent years there has been a growing trend to incorporate diverse domain knowledge to support large-scale analysis of PPI networks. METHODS: This paper presents a new algorithm that incorporates Gene Ontology (GO)-based semantic similarities to detect protein complexes from PPI networks generated by TAP-MS. By taking co-complex relations in TAP-MS data into account, TAP-MS PPI networks are modelled as a bipartite graph in which bait proteins form one set of nodes and prey proteins form the other. Similarities between pairs of bait proteins are computed by considering both topological features and GO-driven semantic similarities. Bait proteins are then grouped by their pairwise similarities to produce a set of 'seed' clusters. An expansion process is applied to each 'seed' cluster to recruit prey proteins that are significantly associated with the same set of bait proteins. Complete protein complexes are thus obtained. RESULTS: The proposed algorithm has been applied to real TAP-MS PPI networks. Fifteen quality measures have been employed to evaluate the quality of the generated protein complexes. Experimental results show that the proposed algorithm greatly improves the accuracy of complex identification and outperforms several state-of-the-art clustering algorithms. Moreover, by incorporating semantic similarity, the proposed algorithm is more robust to noise in the networks.
format	Online Article Text
id	pubmed-3907791
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3907791 2014-02-13 Integrating domain similarity to improve protein complexes identification in TAP-MS data Cai, Bingjing Wang, Haiying Zheng, Huiru Wang, Hui Proteome Sci Research BACKGROUND: Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast-two-hybrid (Y2H) and tandem affinity-purification/mass-spectrometry (TAP-MS) are characterised by the presence of a significant number of false positives and false negatives. In recent years there has been a growing trend to incorporate diverse domain knowledge to support large-scale analysis of PPI networks. METHODS: This paper presents a new algorithm that incorporates Gene Ontology (GO)-based semantic similarities to detect protein complexes from PPI networks generated by TAP-MS. By taking co-complex relations in TAP-MS data into account, TAP-MS PPI networks are modelled as a bipartite graph in which bait proteins form one set of nodes and prey proteins form the other. Similarities between pairs of bait proteins are computed by considering both topological features and GO-driven semantic similarities. Bait proteins are then grouped by their pairwise similarities to produce a set of 'seed' clusters. An expansion process is applied to each 'seed' cluster to recruit prey proteins that are significantly associated with the same set of bait proteins. Complete protein complexes are thus obtained. RESULTS: The proposed algorithm has been applied to real TAP-MS PPI networks. Fifteen quality measures have been employed to evaluate the quality of the generated protein complexes. Experimental results show that the proposed algorithm greatly improves the accuracy of complex identification and outperforms several state-of-the-art clustering algorithms. Moreover, by incorporating semantic similarity, the proposed algorithm is more robust to noise in the networks. BioMed Central 2013-11-07 /pmc/articles/PMC3907791/ /pubmed/24565259 http://dx.doi.org/10.1186/1477-5956-11-S1-S2 Text en Copyright © 2013 Cai et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Research Cai, Bingjing Wang, Haiying Zheng, Huiru Wang, Hui Integrating domain similarity to improve protein complexes identification in TAP-MS data
title	Integrating domain similarity to improve protein complexes identification in TAP-MS data
title_full	Integrating domain similarity to improve protein complexes identification in TAP-MS data
title_fullStr	Integrating domain similarity to improve protein complexes identification in TAP-MS data
title_full_unstemmed	Integrating domain similarity to improve protein complexes identification in TAP-MS data
title_short	Integrating domain similarity to improve protein complexes identification in TAP-MS data
title_sort	integrating domain similarity to improve protein complexes identification in tap-ms data
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3907791/ https://www.ncbi.nlm.nih.gov/pubmed/24565259 http://dx.doi.org/10.1186/1477-5956-11-S1-S2

Similar Items