The ortholog conjecture revisited: the value of orthologs and paralogs in function prediction

MOTIVATION: The computational prediction of gene function is a key step in making full use of newly sequenced genomes. Function is generally predicted by transferring annotations from homologous genes or proteins for which experimental evidence exists. The ‘ortholog conjecture’ proposes that ortholo...

Bibliographic Details
Main authors: Stamboulian, Moses; Guerrero, Rafael F; Hahn, Matthew W; Radivojac, Predrag
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Macromolecular Sequence, Structure, and Function
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355290/
https://www.ncbi.nlm.nih.gov/pubmed/32657391
http://dx.doi.org/10.1093/bioinformatics/btaa468
author Stamboulian, Moses
Guerrero, Rafael F
Hahn, Matthew W
Radivojac, Predrag
author_sort Stamboulian, Moses
collection PubMed
description MOTIVATION: The computational prediction of gene function is a key step in making full use of newly sequenced genomes. Function is generally predicted by transferring annotations from homologous genes or proteins for which experimental evidence exists. The ‘ortholog conjecture’ proposes that orthologous genes should be preferred when making such predictions, as they evolve functions more slowly than paralogous genes. Previous research has provided little support for the ortholog conjecture, though the incomplete nature of the data cast doubt on the conclusions. RESULTS: We use experimental annotations from over 40 000 proteins, drawn from over 80 000 publications, to revisit the ortholog conjecture in two pairs of species: (i) Homo sapiens and Mus musculus and (ii) Saccharomyces cerevisiae and Schizosaccharomyces pombe. By making a distinction between questions about the evolution of function versus questions about the prediction of function, we find strong evidence against the ortholog conjecture in the context of function prediction, though questions about the evolution of function remain difficult to address. In both pairs of species, we quantify the amount of information that would be ignored if paralogs are discarded, as well as the resulting loss in prediction accuracy. Taken as a whole, our results support the view that the types of homologs used for function transfer are largely irrelevant to the task of function prediction. Maximizing the amount of data used for this task, regardless of whether it comes from orthologs or paralogs, is most likely to lead to higher prediction accuracy. AVAILABILITY AND IMPLEMENTATION: https://github.com/predragradivojac/oc. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7355290
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7355290 2020-07-16 Bioinformatics Macromolecular Sequence, Structure, and Function Oxford University Press 2020-07 2020-07-13 /pmc/articles/PMC7355290/ /pubmed/32657391 http://dx.doi.org/10.1093/bioinformatics/btaa468 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title The ortholog conjecture revisited: the value of orthologs and paralogs in function prediction
topic Macromolecular Sequence, Structure, and Function