Most parsimonious reconciliation in the presence of gene duplication, loss, and deep coalescence using labeled coalescent trees

Accurate gene tree-species tree reconciliation is fundamental to inferring the evolutionary history of a gene family. However, although it has long been appreciated that population-related effects such as incomplete lineage sorting (ILS) can dramatically affect the gene tree, many of the most popula...

Bibliographic Details
Main authors: Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Kellis, Manolis
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3941112/
https://www.ncbi.nlm.nih.gov/pubmed/24310000
http://dx.doi.org/10.1101/gr.161968.113
author Wu, Yi-Chieh
Rasmussen, Matthew D.
Bansal, Mukul S.
Kellis, Manolis
collection PubMed
description Accurate gene tree-species tree reconciliation is fundamental to inferring the evolutionary history of a gene family. However, although it has long been appreciated that population-related effects such as incomplete lineage sorting (ILS) can dramatically affect the gene tree, many of the most popular reconciliation methods consider discordance only due to gene duplication and loss (and sometimes horizontal gene transfer). Methods that do model ILS are either highly parameterized or consider a restricted set of histories, thus limiting their applicability and accuracy. To address these challenges, we present a novel algorithm DLCpar for inferring a most parsimonious (MP) history of a gene family in the presence of duplications, losses, and ILS. Our algorithm relies on a new reconciliation structure, the labeled coalescent tree (LCT), that simultaneously describes coalescent and duplication-loss history. We show that the LCT representation enables an exhaustive and efficient search over the space of reconciliations, and, for most gene families, the least common ancestor (LCA) mapping is an optimal solution for the species mapping between the gene tree and species tree in an MP LCT. Applying our algorithm to a variety of clades, including flies, fungi, and primates, as well as to simulated phylogenies, we achieve high accuracy, comparable to sophisticated probabilistic reconciliation methods, at reduced run time and with far fewer parameters. These properties enable inferences of the complex evolution of gene families across a broad range of species and large data sets.
format Online
Article
Text
id pubmed-3941112
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3941112 2014-04-01 Genome Res Method
Cold Spring Harbor Laboratory Press 2014-03 /pmc/articles/PMC3941112/ /pubmed/24310000 http://dx.doi.org/10.1101/gr.161968.113 Text en © 2014 Wu et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/3.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported), as described at http://creativecommons.org/licenses/by-nc/3.0/.
title Most parsimonious reconciliation in the presence of gene duplication, loss, and deep coalescence using labeled coalescent trees
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3941112/
https://www.ncbi.nlm.nih.gov/pubmed/24310000
http://dx.doi.org/10.1101/gr.161968.113