DISCO+QR: rooting species trees in the presence of GDL and ILS


Bibliographic Details
Main authors: Willson, James, Tabatabaee, Yasamin, Liu, Baqiao, Warnow, Tandy
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9923442/
https://www.ncbi.nlm.nih.gov/pubmed/36789293
http://dx.doi.org/10.1093/bioadv/vbad015
author Willson, James
Tabatabaee, Yasamin
Liu, Baqiao
Warnow, Tandy
collection PubMed
description MOTIVATION: Genes evolve under processes such as gene duplication and loss (GDL), so that gene family trees are multi-copy, as well as incomplete lineage sorting (ILS); both processes produce gene trees that differ from the species tree. The estimation of species trees from sets of gene family trees is challenging, and the estimation of rooted species trees presents additional analytical challenges. Two of the methods developed for this problem are STRIDE, which roots species trees by considering GDL events, and Quintet Rooting (QR), which roots species trees by considering ILS. RESULTS: We present DISCO+QR, a new approach to rooting species trees that first uses DISCO to address GDL and then uses QR to perform rooting in the presence of ILS. DISCO+QR operates by taking the input gene family trees, decomposing them into single-copy trees using DISCO, and then rooting the given species tree with QR using the information in the single-copy gene trees. We show that the relative accuracy of STRIDE and DISCO+QR depends on the properties of the dataset (number of species, number of genes, rate of gene duplication, degree of ILS and gene tree estimation error), and that each provides advantages over the other under some conditions. AVAILABILITY AND IMPLEMENTATION: DISCO and QR are available on GitHub. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
format Online
Article
Text
id pubmed-9923442
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9923442 2023-02-13 DISCO+QR: rooting species trees in the presence of GDL and ILS Willson, James Tabatabaee, Yasamin Liu, Baqiao Warnow, Tandy Bioinform Adv Original Paper Oxford University Press 2023-02-07 /pmc/articles/PMC9923442/ /pubmed/36789293 http://dx.doi.org/10.1093/bioadv/vbad015 Text en © The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title DISCO+QR: rooting species trees in the presence of GDL and ILS
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9923442/
https://www.ncbi.nlm.nih.gov/pubmed/36789293
http://dx.doi.org/10.1093/bioadv/vbad015