
Improved homology-driven computational validation of protein-protein interactions motivated by the evolutionary gene duplication and divergence hypothesis

BACKGROUND: Protein-protein interaction (PPI) data sets generated by high-throughput experiments are contaminated by large numbers of erroneous PPIs. Therefore, computational methods for PPI validation are necessary to improve the quality of such data sets. Against the background of the theory that...


Bibliographic Details
Main authors: Frech, Christian; Kommenda, Michael; Dorfer, Viktoria; Kern, Thomas; Hintner, Helmut; Bauer, Johann W; Önder, Kamil
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2637843/
https://www.ncbi.nlm.nih.gov/pubmed/19152684
http://dx.doi.org/10.1186/1471-2105-10-21
author Frech, Christian
Kommenda, Michael
Dorfer, Viktoria
Kern, Thomas
Hintner, Helmut
Bauer, Johann W
Önder, Kamil
collection PubMed
description BACKGROUND: Protein-protein interaction (PPI) data sets generated by high-throughput experiments are contaminated by large numbers of erroneous PPIs. Therefore, computational methods for PPI validation are necessary to improve the quality of such data sets. Against the background of the theory that most extant PPIs arose as a consequence of gene duplication, the sensitive search for homologous PPIs, i.e. for PPIs descending from a common ancestral PPI, should be a successful strategy for PPI validation. RESULTS: To validate an experimentally observed PPI, we combine FASTA and PSI-BLAST to perform a sensitive sequence-based search for pairs of interacting homologous proteins within a large, integrated PPI database. A novel scoring scheme that incorporates both quality and quantity of all observed matches allows us (1) to consider also tentative paralogs and orthologs in this analysis and (2) to combine search results from more than one homology detection method. ROC curves illustrate the high efficacy of this approach and its improvement over other homology-based validation methods. CONCLUSION: New PPIs are primarily derived from preexisting PPIs and not invented de novo. Thus, the hallmark of true PPIs is the existence of homologous PPIs. The sensitive search for homologous PPIs within a large body of known PPIs is an efficient strategy to separate biologically relevant PPIs from the many spurious PPIs reported by high-throughput experiments.
format Text
id pubmed-2637843
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2637843 2009-02-10 BMC Bioinformatics Methodology Article BioMed Central 2009-01-19 /pmc/articles/PMC2637843/ /pubmed/19152684 http://dx.doi.org/10.1186/1471-2105-10-21 Text en Copyright © 2009 Frech et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Improved homology-driven computational validation of protein-protein interactions motivated by the evolutionary gene duplication and divergence hypothesis
topic Methodology Article
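
The abstract in this record describes validating a candidate PPI by searching a database of known PPIs for interacting pairs of homologous proteins, with a score that reflects both the quality and the quantity of the matches. The record does not give the actual scoring formula, so the following is only a minimal sketch of the general idea under stated assumptions: homology hits are supplied as precomputed E-values (for example, parsed from FASTA or PSI-BLAST output), match quality is taken as -log10 of the E-value, and the weaker of the two hits decides how much a homologous PPI contributes. The names `hit_quality`, `homology_score`, `homologs`, and `known_ppis` are hypothetical and do not come from the paper.

```python
import math


def hit_quality(evalue, floor=1e-180):
    """Map an E-value to a non-negative quality score.

    Assumption: quality = -log10(E-value), clamped at 0 for E-values
    above 1 and capped by `floor` to avoid log10(0).
    """
    return max(0.0, -math.log10(max(evalue, floor)))


def homology_score(a, b, homologs, known_ppis):
    """Score the candidate interaction (a, b).

    homologs:   dict mapping a protein ID to {homolog ID: E-value}
    known_ppis: set of frozensets, each holding the IDs of one known PPI

    Every known PPI (x, y) with x homologous to a and y homologous to b
    adds the weaker of the two hit qualities, so the total grows with
    both the number of homologous PPIs and the strength of each match.
    """
    score = 0.0
    for x, evalue_x in homologs.get(a, {}).items():
        for y, evalue_y in homologs.get(b, {}).items():
            if frozenset((x, y)) in known_ppis:
                score += min(hit_quality(evalue_x), hit_quality(evalue_y))
    return score


# Toy data, invented purely for illustration.
homologs = {
    "A": {"A1": 1e-50, "A2": 1e-5},
    "B": {"B1": 1e-20},
    "C": {"C1": 1e-30},
}
known_ppis = {frozenset(("A1", "B1")), frozenset(("A2", "B1"))}

supported = homology_score("A", "B", homologs, known_ppis)
unsupported = homology_score("A", "C", homologs, known_ppis)
```

In this toy setup the pair ("A", "B") is supported by two homologous PPIs and receives a positive score, while ("A", "C") has no homologous PPI in the database and scores zero. Hits from more than one homology detection method could be merged into `homologs` before scoring, but how the paper combines them is not stated in this record.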