
Phylogenomic branch length estimation using quartets



Bibliographic Details
Main authors: Tabatabaee, Yasamin, Zhang, Chao, Warnow, Tandy, Mirarab, Siavash
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311336/
https://www.ncbi.nlm.nih.gov/pubmed/37387151
http://dx.doi.org/10.1093/bioinformatics/btad221
author Tabatabaee, Yasamin
Zhang, Chao
Warnow, Tandy
Mirarab, Siavash
collection PubMed
description MOTIVATION: Branch lengths and topology of a species tree are essential in most downstream analyses, including estimation of diversification dates, characterization of selection, understanding adaptation, and comparative genomics. Modern phylogenomic analyses often use methods that account for the heterogeneity of evolutionary histories across the genome due to processes such as incomplete lineage sorting. However, these methods typically do not generate branch lengths in units that are usable by downstream applications, forcing phylogenomic analyses to resort to alternative shortcuts such as estimating branch lengths by concatenating gene alignments into a supermatrix. Yet, concatenation and other available approaches for estimating branch lengths fail to address heterogeneity across the genome. RESULTS: In this article, we derive expected values of gene tree branch lengths in substitution units under an extension of the multispecies coalescent (MSC) model that allows substitutions with varying rates across the species tree. We present CASTLES, a new technique for estimating branch lengths on the species tree from estimated gene trees that uses these expected values, and our study shows that CASTLES improves on the most accurate prior methods with respect to both speed and accuracy. AVAILABILITY AND IMPLEMENTATION: CASTLES is available at https://github.com/ytabatabaee/CASTLES.
format Online
Article
Text
id pubmed-10311336
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10311336
2023-07-01
Phylogenomic branch length estimation using quartets
Tabatabaee, Yasamin
Zhang, Chao
Warnow, Tandy
Mirarab, Siavash
Bioinformatics
Evolutionary, Comparative and Population Genomics
Oxford University Press
2023-06-30
/pmc/articles/PMC10311336/
/pubmed/37387151
http://dx.doi.org/10.1093/bioinformatics/btad221
Text
en
© The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Phylogenomic branch length estimation using quartets
topic Evolutionary, Comparative and Population Genomics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311336/
https://www.ncbi.nlm.nih.gov/pubmed/37387151
http://dx.doi.org/10.1093/bioinformatics/btad221