Efficient Bayesian inference under the structured coalescent

Motivation: Population structure significantly affects evolutionary dynamics. Such structure may be due to spatial segregation, but may also reflect any other gene-flow-limiting aspect of a model. In combination with the structured coalescent, this fact can be used to inform phylogenetic tree recons...

Bibliographic Details
Main authors: Vaughan, Timothy G., Kühnert, Denise, Popinga, Alex, Welch, David, Drummond, Alexei J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4207426/
https://www.ncbi.nlm.nih.gov/pubmed/24753484
http://dx.doi.org/10.1093/bioinformatics/btu201
author Vaughan, Timothy G.
Kühnert, Denise
Popinga, Alex
Welch, David
Drummond, Alexei J.
collection PubMed
description Motivation: Population structure significantly affects evolutionary dynamics. Such structure may be due to spatial segregation, but may also reflect any other gene-flow-limiting aspect of a model. In combination with the structured coalescent, this fact can be used to inform phylogenetic tree reconstruction, as well as to infer parameters such as migration rates and subpopulation sizes from annotated sequence data. However, conducting Bayesian inference under the structured coalescent is impeded by the difficulty of constructing Markov Chain Monte Carlo (MCMC) sampling algorithms (samplers) capable of efficiently exploring the state space. Results: In this article, we present a new MCMC sampler capable of sampling from posterior distributions over structured trees: timed phylogenetic trees in which lineages are associated with the distinct subpopulation in which they lie. The sampler includes a set of MCMC proposal functions that offer significant mixing improvements over a previously published method. Furthermore, its implementation as a BEAST 2 package ensures maximum flexibility with respect to model and prior specification. We demonstrate the usefulness of this new sampler by using it to infer migration rates and effective population sizes of H3N2 influenza between New Zealand, New York and Hong Kong from publicly available hemagglutinin (HA) gene sequences under the structured coalescent. Availability and implementation: The sampler has been implemented as a publicly available BEAST 2 package that is distributed under version 3 of the GNU General Public License at http://compevol.github.io/MultiTypeTree. Contact: tgvaughan@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4207426
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4207426 2014-10-28 Bioinformatics Original Papers Oxford University Press 2014-08-15 2014-04-20 /pmc/articles/PMC4207426/ /pubmed/24753484 http://dx.doi.org/10.1093/bioinformatics/btu201 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Efficient Bayesian inference under the structured coalescent
topic Original Papers