Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses
Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS...
Main authors: | Lanfear, Robert; Hua, Xia; Warren, Dan L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2016 |
Subjects: | Genome Resources |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5010905/ https://www.ncbi.nlm.nih.gov/pubmed/27435794 http://dx.doi.org/10.1093/gbe/evw171 |
_version_ | 1782451751612317696 |
---|---|
author | Lanfear, Robert Hua, Xia Warren, Dan L. |
author_facet | Lanfear, Robert Hua, Xia Warren, Dan L. |
author_sort | Lanfear, Robert |
collection | PubMed |
description | Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. |
format | Online Article Text |
id | pubmed-5010905 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-50109052016-09-06 Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses Lanfear, Robert Hua, Xia Warren, Dan L. Genome Biol Evol Genome Resources Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. Oxford University Press 2016-07-19 /pmc/articles/PMC5010905/ /pubmed/27435794 http://dx.doi.org/10.1093/gbe/evw171 Text en © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Genome Resources Lanfear, Robert Hua, Xia Warren, Dan L. Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses |
title | Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses |
title_full | Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses |
title_fullStr | Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses |
title_full_unstemmed | Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses |
title_short | Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses |
title_sort | estimating the effective sample size of tree topologies from bayesian phylogenetic analyses |
topic | Genome Resources |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5010905/ https://www.ncbi.nlm.nih.gov/pubmed/27435794 http://dx.doi.org/10.1093/gbe/evw171 |
work_keys_str_mv | AT lanfearrobert estimatingtheeffectivesamplesizeoftreetopologiesfrombayesianphylogeneticanalyses AT huaxia estimatingtheeffectivesamplesizeoftreetopologiesfrombayesianphylogeneticanalyses AT warrendanl estimatingtheeffectivesamplesizeoftreetopologiesfrombayesianphylogeneticanalyses |
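
The description above notes that the ESS of a parameter is often much lower than the number of MCMC samples because sequential samples are autocorrelated. As a rough illustration of that point only, the sketch below estimates the ESS of a scalar trace with the common autocorrelation-based formula ESS = N / (1 + 2 Σ ρ_k), truncating the sum at the first non-positive lag. This is not the tree-topology method developed in the article, which this record does not describe. The function names, the truncation rule, and the simulated AR(1) chains are all assumptions made for the example.

```python
import random


def autocorrelation(xs, lag):
    """Sample autocorrelation of the sequence xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return 0.0  # a constant trace has no measurable autocorrelation
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(xs):
    """ESS = N / (1 + 2 * sum of autocorrelations).

    Assumption for this sketch: the sum stops at the first lag whose
    autocorrelation is not positive, a simple truncation rule.
    """
    n = len(xs)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = autocorrelation(xs, lag)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def ar1_chain(n, phi, seed=0):
    """Hypothetical stand-in for an MCMC trace: x_t = phi * x_{t-1} + noise."""
    rng = random.Random(seed)
    x = 0.0
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out


# Same number of samples, very different amounts of independent information.
independent = ar1_chain(2000, phi=0.0)
correlated = ar1_chain(2000, phi=0.9)

ess_independent = effective_sample_size(independent)
ess_correlated = effective_sample_size(correlated)

print(f"ESS, uncorrelated chain: {ess_independent:.0f} of 2000 samples")
print(f"ESS, autocorrelated chain: {ess_correlated:.0f} of 2000 samples")
```

Both chains contain 2000 samples, but the strongly autocorrelated one should report a much smaller ESS, which is the situation the rule of thumb (ESS greater than 200) is meant to catch. Tree topologies are not scalar values, so this formula cannot be applied to them directly; that gap is the problem the article addresses.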