Accurate prediction of orthologs in the presence of divergence after duplication

MOTIVATION: When gene duplication occurs, one of the copies may become free of selective pressure and evolve at an accelerated pace. This has important consequences on the prediction of orthology relationships, since two orthologous genes separated by divergence after duplication may differ in both...

Bibliographic Details
Main Authors: Lafond, Manuel; Meghdari Miardan, Mona; Sankoff, David
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022570/
https://www.ncbi.nlm.nih.gov/pubmed/29950018
http://dx.doi.org/10.1093/bioinformatics/bty242
author Lafond, Manuel
Meghdari Miardan, Mona
Sankoff, David
collection PubMed
description MOTIVATION: When gene duplication occurs, one of the copies may become free of selective pressure and evolve at an accelerated pace. This has important consequences on the prediction of orthology relationships, since two orthologous genes separated by divergence after duplication may differ in both sequence and function. In this work, we make the distinction between the primary orthologs, which have not been affected by accelerated mutation rates on their evolutionary path, and the secondary orthologs, which have. Similarity-based prediction methods will tend to miss secondary orthologs, whereas phylogeny-based methods cannot separate primary and secondary orthologs. However, both types of orthology have applications in important areas such as gene function prediction and phylogenetic reconstruction, motivating the need for methods that can distinguish the two types. RESULTS: We formalize the notion of divergence after duplication and provide a theoretical basis for the inference of primary and secondary orthologs. We then put these ideas to practice with the Hybrid Prediction of Paralogs and Orthologs (HyPPO) framework, which combines ideas from both similarity and phylogeny approaches. We apply our method to simulated and empirical datasets and show that we achieve superior accuracy in predicting primary orthologs, secondary orthologs and paralogs. AVAILABILITY AND IMPLEMENTATION: HyPPO is a modular framework with a core developed in Python and is provided with a variety of C++ modules. The source code is available at https://github.com/manuellafond/HyPPO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6022570
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6022570 2018-07-10 Accurate prediction of orthologs in the presence of divergence after duplication Lafond, Manuel Meghdari Miardan, Mona Sankoff, David Bioinformatics ISMB 2018–Intelligent Systems for Molecular Biology Proceedings
Oxford University Press 2018-07-01 2018-06-27 /pmc/articles/PMC6022570/ /pubmed/29950018 http://dx.doi.org/10.1093/bioinformatics/bty242 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Accurate prediction of orthologs in the presence of divergence after duplication
topic ISMB 2018–Intelligent Systems for Molecular Biology Proceedings