
Computational Performance and Statistical Accuracy of *BEAST and Comparisons with Other Methods

Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common genealogical history, thereby equating gene coalescence with...


Bibliographic Details
Main authors: Ogilvie, Huw A., Heled, Joseph, Xie, Dong, Drummond, Alexei J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects: Society of Systematic Biologists Symposium Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4851174/
https://www.ncbi.nlm.nih.gov/pubmed/26821913
http://dx.doi.org/10.1093/sysbio/syv118
author Ogilvie, Huw A.
Heled, Joseph
Xie, Dong
Drummond, Alexei J.
collection PubMed
description Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common genealogical history, thereby equating gene coalescence with species divergence. The multispecies coalescent is supported by previous studies which found that its predicted distributions fit empirical data, and that concatenation is not a consistent estimator of the species tree. *BEAST, a fully Bayesian implementation of the multispecies coalescent, is popular but computationally intensive, so the increasing size of phylogenetic data sets is both a computational challenge and an opportunity for better systematics. Using simulation studies, we characterize the scaling behavior of *BEAST, and enable quantitative prediction of the impact increasing the number of loci has on both computational performance and statistical accuracy. Follow-up simulations over a wide range of parameters show that the statistical performance of *BEAST relative to concatenation improves both as branch length is reduced and as the number of loci is increased. Finally, using simulations based on estimated parameters from two phylogenomic data sets, we compare the performance of a range of species tree and concatenation methods to show that using *BEAST with tens of loci can be preferable to using concatenation with thousands of loci. Our results provide insight into the practicalities of Bayesian species tree estimation, the number of loci required to obtain a given level of accuracy and the situations in which supermatrix or summary methods will be outperformed by the fully Bayesian multispecies coalescent.
format Online
Article
Text
id pubmed-4851174
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4851174 2016-05-02 Computational Performance and Statistical Accuracy of *BEAST and Comparisons with Other Methods Ogilvie, Huw A. Heled, Joseph Xie, Dong Drummond, Alexei J. Syst Biol Society of Systematic Biologists Symposium Articles
Oxford University Press 2016-05 2016-01-28 /pmc/articles/PMC4851174/ /pubmed/26821913 http://dx.doi.org/10.1093/sysbio/syv118 Text en © The Author(s) 2016. Published by Oxford University Press on behalf of the Society of Systematic Biologists. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Computational Performance and Statistical Accuracy of *BEAST and Comparisons with Other Methods
topic Society of Systematic Biologists Symposium Articles