Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset
High-throughput studies of protein interactions may have produced, experimentally and computationally, the most comprehensive protein–protein interaction datasets in the completely sequenced genomes. It provides us an opportunity on a proteome scale, to discover the underlying protein interaction pa...
Main authors: | Guo, Jie; Wu, Xiaomei; Zhang, Da-Yong; Lin, Kui |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2008 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2346601/ https://www.ncbi.nlm.nih.gov/pubmed/18281313 http://dx.doi.org/10.1093/nar/gkn016 |
_version_ | 1782152845262323712 |
---|---|
author | Guo, Jie Wu, Xiaomei Zhang, Da-Yong Lin, Kui |
author_facet | Guo, Jie Wu, Xiaomei Zhang, Da-Yong Lin, Kui |
author_sort | Guo, Jie |
collection | PubMed |
description | High-throughput studies of protein interactions may have produced, experimentally and computationally, the most comprehensive protein–protein interaction datasets in the completely sequenced genomes. It provides us an opportunity on a proteome scale, to discover the underlying protein interaction patterns. Here, we propose an approach to discovering motif pairs at interaction sites (often 3–8 residues) that are essential for understanding protein functions and helpful for the rational design of protein engineering and folding experiments. A gold standard positive (interacting) dataset and a gold standard negative (non-interacting) dataset were mined to infer the interacting motif pairs that are significantly overrepresented in the positive dataset compared to the negative dataset. Four negative datasets assembled by different strategies were evaluated and the one with the best performance was used as the gold standard negatives for further analysis. Meanwhile, to assess the efficiency of our method in detecting potential interacting motif pairs, other approaches developed previously were compared, and we found that our method achieved the highest prediction accuracy. In addition, many uncharacterized motif pairs of interest were found to be functional with experimental evidence in other species. This investigation demonstrates the important effects of a high-quality negative dataset on the performance of such statistical inference. |
format | Text |
id | pubmed-2346601 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-23466012008-05-05 Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset Guo, Jie Wu, Xiaomei Zhang, Da-Yong Lin, Kui Nucleic Acids Res Computational Biology High-throughput studies of protein interactions may have produced, experimentally and computationally, the most comprehensive protein–protein interaction datasets in the completely sequenced genomes. It provides us an opportunity on a proteome scale, to discover the underlying protein interaction patterns. Here, we propose an approach to discovering motif pairs at interaction sites (often 3–8 residues) that are essential for understanding protein functions and helpful for the rational design of protein engineering and folding experiments. A gold standard positive (interacting) dataset and a gold standard negative (non-interacting) dataset were mined to infer the interacting motif pairs that are significantly overrepresented in the positive dataset compared to the negative dataset. Four negative datasets assembled by different strategies were evaluated and the one with the best performance was used as the gold standard negatives for further analysis. Meanwhile, to assess the efficiency of our method in detecting potential interacting motif pairs, other approaches developed previously were compared, and we found that our method achieved the highest prediction accuracy. In addition, many uncharacterized motif pairs of interest were found to be functional with experimental evidence in other species. This investigation demonstrates the important effects of a high-quality negative dataset on the performance of such statistical inference. 
Oxford University Press 2008-04 2008-02-14 /pmc/articles/PMC2346601/ /pubmed/18281313 http://dx.doi.org/10.1093/nar/gkn016 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Computational Biology Guo, Jie Wu, Xiaomei Zhang, Da-Yong Lin, Kui Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset |
title | Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset |
title_full | Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset |
title_fullStr | Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset |
title_full_unstemmed | Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset |
title_short | Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset |
title_sort | genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2346601/ https://www.ncbi.nlm.nih.gov/pubmed/18281313 http://dx.doi.org/10.1093/nar/gkn016 |
work_keys_str_mv | AT guojie genomewideinferenceofproteininteractionsiteslessonsfromtheyeasthighqualitynegativeproteinproteininteractiondataset AT wuxiaomei genomewideinferenceofproteininteractionsiteslessonsfromtheyeasthighqualitynegativeproteinproteininteractiondataset AT zhangdayong genomewideinferenceofproteininteractionsiteslessonsfromtheyeasthighqualitynegativeproteinproteininteractiondataset AT linkui genomewideinferenceofproteininteractionsiteslessonsfromtheyeasthighqualitynegativeproteinproteininteractiondataset |