
Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses

Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS...


Bibliographic Details
Main Authors: Lanfear, Robert; Hua, Xia; Warren, Dan L.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects: Genome Resources
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5010905/
https://www.ncbi.nlm.nih.gov/pubmed/27435794
http://dx.doi.org/10.1093/gbe/evw171
author Lanfear, Robert
Hua, Xia
Warren, Dan L.
collection PubMed
description Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses.
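The description above explains that autocorrelation between sequential MCMC samples makes the ESS lower than the raw number of samples. A minimal sketch of that idea for a single continuous parameter follows. It is not the authors' method for tree topologies; it uses the standard estimator ESS = N / (1 + 2 Σ ρ_k) and assumes a simple truncation rule (stop summing at the first non-positive autocorrelation). The function name `effective_sample_size` and the AR(1) test chain are illustrative assumptions, not taken from the record.

```python
import numpy as np


def effective_sample_size(trace):
    """Estimate the ESS of a scalar MCMC trace as N / (1 + 2 * sum_k rho_k).

    Assumption: the sum over lags is truncated at the first non-positive
    sample autocorrelation, a simple heuristic rather than the rule used
    by any particular phylogenetics package.
    """
    x = np.asarray(trace, dtype=float)
    n = x.size
    centered = x - x.mean()
    var = np.dot(centered, centered) / n
    if var == 0.0:
        return float("nan")  # a constant trace has no defined autocorrelation

    rho_sum = 0.0
    for lag in range(1, n):
        # Sample autocorrelation at this lag.
        rho = np.dot(centered[:-lag], centered[lag:]) / (n * var)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Hypothetical demonstration: an autocorrelated AR(1) chain versus independent draws.
rng = np.random.default_rng(0)
n_samples = 2000
independent = rng.normal(size=n_samples)
autocorrelated = np.zeros(n_samples)
for t in range(1, n_samples):
    autocorrelated[t] = 0.9 * autocorrelated[t - 1] + rng.normal()

print("ESS, independent draws:", round(effective_sample_size(independent)))
print("ESS, autocorrelated chain:", round(effective_sample_size(autocorrelated)))
```

Under these assumptions the autocorrelated chain reports a much smaller ESS than the independent draws, even though both contain the same number of samples. This is the gap that the rule of thumb of ESS greater than 200 is meant to catch.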
format Online
Article
Text
id pubmed-5010905
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5010905 2016-09-06 Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses. Lanfear, Robert; Hua, Xia; Warren, Dan L. Genome Biol Evol. Genome Resources. Oxford University Press 2016-07-19. /pmc/articles/PMC5010905/ /pubmed/27435794 http://dx.doi.org/10.1093/gbe/evw171 Text en
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses
topic Genome Resources
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5010905/
https://www.ncbi.nlm.nih.gov/pubmed/27435794
http://dx.doi.org/10.1093/gbe/evw171