Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset

High-throughput studies of protein interactions may have produced, experimentally and computationally, the most comprehensive protein–protein interaction datasets in the completely sequenced genomes. It provides us an opportunity on a proteome scale, to discover the underlying protein interaction pa...

Bibliographic Details
Main authors: Guo, Jie; Wu, Xiaomei; Zhang, Da-Yong; Lin, Kui
Format: Text
Language: English
Published: Oxford University Press 2008
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2346601/
https://www.ncbi.nlm.nih.gov/pubmed/18281313
http://dx.doi.org/10.1093/nar/gkn016
author Guo, Jie
Wu, Xiaomei
Zhang, Da-Yong
Lin, Kui
collection PubMed
description High-throughput studies of protein interactions may have produced, experimentally and computationally, the most comprehensive protein–protein interaction datasets in the completely sequenced genomes. It provides us an opportunity on a proteome scale, to discover the underlying protein interaction patterns. Here, we propose an approach to discovering motif pairs at interaction sites (often 3–8 residues) that are essential for understanding protein functions and helpful for the rational design of protein engineering and folding experiments. A gold standard positive (interacting) dataset and a gold standard negative (non-interacting) dataset were mined to infer the interacting motif pairs that are significantly overrepresented in the positive dataset compared to the negative dataset. Four negative datasets assembled by different strategies were evaluated and the one with the best performance was used as the gold standard negatives for further analysis. Meanwhile, to assess the efficiency of our method in detecting potential interacting motif pairs, other approaches developed previously were compared, and we found that our method achieved the highest prediction accuracy. In addition, many uncharacterized motif pairs of interest were found to be functional with experimental evidence in other species. This investigation demonstrates the important effects of a high-quality negative dataset on the performance of such statistical inference.
format Text
id pubmed-2346601
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-23466012008-05-05 Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset Guo, Jie Wu, Xiaomei Zhang, Da-Yong Lin, Kui Nucleic Acids Res Computational Biology High-throughput studies of protein interactions may have produced, experimentally and computationally, the most comprehensive protein–protein interaction datasets in the completely sequenced genomes. It provides us an opportunity on a proteome scale, to discover the underlying protein interaction patterns. Here, we propose an approach to discovering motif pairs at interaction sites (often 3–8 residues) that are essential for understanding protein functions and helpful for the rational design of protein engineering and folding experiments. A gold standard positive (interacting) dataset and a gold standard negative (non-interacting) dataset were mined to infer the interacting motif pairs that are significantly overrepresented in the positive dataset compared to the negative dataset. Four negative datasets assembled by different strategies were evaluated and the one with the best performance was used as the gold standard negatives for further analysis. Meanwhile, to assess the efficiency of our method in detecting potential interacting motif pairs, other approaches developed previously were compared, and we found that our method achieved the highest prediction accuracy. In addition, many uncharacterized motif pairs of interest were found to be functional with experimental evidence in other species. This investigation demonstrates the important effects of a high-quality negative dataset on the performance of such statistical inference. 
Oxford University Press 2008-04 2008-02-14 /pmc/articles/PMC2346601/ /pubmed/18281313 http://dx.doi.org/10.1093/nar/gkn016 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Computational Biology
Guo, Jie
Wu, Xiaomei
Zhang, Da-Yong
Lin, Kui
Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset
title Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset
topic Computational Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2346601/
https://www.ncbi.nlm.nih.gov/pubmed/18281313
http://dx.doi.org/10.1093/nar/gkn016