DISCO+QR: rooting species trees in the presence of GDL and ILS

Bibliographic Details
Main Authors: Willson, James; Tabatabaee, Yasamin; Liu, Baqiao; Warnow, Tandy
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9923442/
https://www.ncbi.nlm.nih.gov/pubmed/36789293
http://dx.doi.org/10.1093/bioadv/vbad015
Description
Summary: MOTIVATION: Genes evolve under processes such as gene duplication and loss (GDL), which makes gene family trees multi-copy, and incomplete lineage sorting (ILS); both processes produce gene trees that differ from the species tree. The estimation of species trees from sets of gene family trees is challenging, and the estimation of rooted species trees presents additional analytical challenges. Two of the methods developed for this problem are STRIDE, which roots species trees by considering GDL events, and Quintet Rooting (QR), which roots species trees by considering ILS. RESULTS: We present DISCO+QR, a new approach to rooting species trees that first uses DISCO to address GDL and then uses QR to perform rooting in the presence of ILS. DISCO+QR takes the input gene family trees, decomposes them into single-copy trees using DISCO, and then uses QR to root the given species tree from the information in those single-copy gene trees. We show that the relative accuracy of STRIDE and DISCO+QR depends on the properties of the dataset (number of species and genes, rate of gene duplication, degree of ILS, and gene tree estimation error), and that each provides advantages over the other under some conditions. AVAILABILITY AND IMPLEMENTATION: DISCO and QR are available on GitHub. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.