Predicting co-complexed protein pairs using genomic and proteomic data integration

BACKGROUND: Identifying all protein-protein interactions in an organism is a major objective of proteomics. A related goal is to know which protein pairs are present in the same protein complex. High-throughput methods such as yeast two-hybrid (Y2H) and affinity purification coupled with mass spectr...

Bibliographic Details
Main authors: Zhang, Lan V, Wong, Sharyl L, King, Oliver D, Roth, Frederick P
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC419405/
https://www.ncbi.nlm.nih.gov/pubmed/15090078
http://dx.doi.org/10.1186/1471-2105-5-38
author Zhang, Lan V
Wong, Sharyl L
King, Oliver D
Roth, Frederick P
collection PubMed
description BACKGROUND: Identifying all protein-protein interactions in an organism is a major objective of proteomics. A related goal is to know which protein pairs are present in the same protein complex. High-throughput methods such as yeast two-hybrid (Y2H) and affinity purification coupled with mass spectrometry (APMS) have been used to detect interacting proteins on a genomic scale. However, both Y2H and APMS methods have substantial false-positive rates. Aside from high-throughput interaction screens, other gene- or protein-pair characteristics may also be informative of physical interaction. It is therefore desirable to integrate multiple datasets and exploit their differing predictive value to predict co-complexed relationships more accurately. RESULTS: Using a supervised machine learning approach, the probabilistic decision tree, we integrated high-throughput protein interaction datasets and other gene- and protein-pair characteristics to predict co-complexed pairs (CCPs) of proteins. Our predictions proved more sensitive and specific than predictions based on Y2H or APMS methods alone or in combination. Among the top predictions not annotated as CCPs in our reference set (obtained from the MIPS complex catalogue), a significant fraction was found to physically interact according to a separate database (YPD, Yeast Proteome Database), and the remaining predictions may represent unknown CCPs. CONCLUSIONS: We demonstrated that the probabilistic decision tree approach can be successfully used to predict co-complexed protein (CCP) pairs from other characteristics. Our top-scoring CCP predictions provide testable hypotheses for experimental validation.
format Text
id pubmed-419405
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-419405 2004-05-28 BMC Bioinformatics Research Article
BioMed Central 2004-04-16 /pmc/articles/PMC419405/ /pubmed/15090078 http://dx.doi.org/10.1186/1471-2105-5-38 Text en Copyright © 2004 Zhang et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title Predicting co-complexed protein pairs using genomic and proteomic data integration
topic Research Article