
Fast and accurate branch lengths estimation for phylogenomic trees

BACKGROUND: Branch lengths are an important attribute of phylogenetic trees, providing essential information for many studies in evolutionary biology. Yet, part of the current methodology to reconstruct a phylogeny from genomic information — namely supertree methods — focuses on the topology or stru...

Bibliographic Details
Main authors: Binet, Manuel; Gascuel, Olivier; Scornavacca, Celine; P. Douzery, Emmanuel J.; Pardi, Fabio
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4705742/
https://www.ncbi.nlm.nih.gov/pubmed/26744021
http://dx.doi.org/10.1186/s12859-015-0821-8
author Binet, Manuel
Gascuel, Olivier
Scornavacca, Celine
P. Douzery, Emmanuel J.
Pardi, Fabio
collection PubMed
description BACKGROUND: Branch lengths are an important attribute of phylogenetic trees, providing essential information for many studies in evolutionary biology. Yet, part of the current methodology to reconstruct a phylogeny from genomic information — namely supertree methods — focuses on the topology or structure of the phylogenetic tree, rather than the evolutionary divergences associated to it. Moreover, accurate methods to estimate branch lengths — typically based on probabilistic analysis of a concatenated alignment — are limited by large demands in memory and computing time, and may become impractical when the data sets are too large. RESULTS: Here, we present a novel phylogenomic distance-based method, named ERaBLE (Evolutionary Rates and Branch Length Estimation), to estimate the branch lengths of a given reference topology, and the relative evolutionary rates of the genes employed in the analysis. ERaBLE uses as input data a potentially very large collection of distance matrices, where each matrix is obtained from a different genomic region — either directly from its sequence alignment, or indirectly from a gene tree inferred from the alignment. Our experiments show that ERaBLE is very fast and fairly accurate when compared to other possible approaches for the same tasks. Specifically, it efficiently and accurately deals with large data sets, such as the OrthoMaM v8 database, composed of 6,953 exons from up to 40 mammals. CONCLUSIONS: ERaBLE may be used as a complement to supertree methods — or it may provide an efficient alternative to maximum likelihood analysis of concatenated alignments — to estimate branch lengths from phylogenomic data sets. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0821-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4705742
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4705742 2016-01-09 Fast and accurate branch lengths estimation for phylogenomic trees Binet, Manuel Gascuel, Olivier Scornavacca, Celine P. Douzery, Emmanuel J. Pardi, Fabio BMC Bioinformatics Research Article BioMed Central 2016-01-07 /pmc/articles/PMC4705742/ /pubmed/26744021 http://dx.doi.org/10.1186/s12859-015-0821-8 Text en © Binet et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Fast and accurate branch lengths estimation for phylogenomic trees
topic Research Article