Efficient Bayesian Species Tree Inference under the Multispecies Coalescent

Bibliographic Details
Main authors: Rannala, Bruce; Yang, Ziheng
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8562347/
https://www.ncbi.nlm.nih.gov/pubmed/28053140
http://dx.doi.org/10.1093/sysbio/syw119
author Rannala, Bruce
Yang, Ziheng
author_facet Rannala, Bruce
Yang, Ziheng
author_sort Rannala, Bruce
collection PubMed
description We develop a Bayesian method for inferring the species phylogeny under the multispecies coalescent (MSC) model. To improve the mixing properties of the Markov chain Monte Carlo (MCMC) algorithm that traverses the space of species trees, we implement two efficient MCMC proposals: the first is based on the Subtree Pruning and Regrafting (SPR) algorithm and the second is based on a node-slider algorithm. Like the Nearest-Neighbor Interchange (NNI) algorithm we implemented previously, both new algorithms propose changes to the species tree, while simultaneously altering the gene trees at multiple genetic loci to automatically avoid conflicts with the newly proposed species tree. The method integrates over gene trees, naturally taking account of the uncertainty of gene tree topology and branch lengths given the sequence data. A simulation study was performed to examine the statistical properties of the new method. The method was found to show excellent statistical performance, inferring the correct species tree with near certainty when 10 loci were included in the dataset. The prior on species trees has some impact, particularly for small numbers of loci. We analyzed several previously published datasets (both real and simulated) for rattlesnakes and Philippine shrews, in comparison with alternative methods. The results suggest that the Bayesian coalescent-based method is statistically more efficient than heuristic methods based on summary statistics, and that our implementation is computationally more efficient than alternative full-likelihood methods under the MSC. Parameter estimates for the rattlesnake data suggest drastically different evolutionary dynamics between the nuclear and mitochondrial loci, even though they support largely consistent species trees. We discuss the different challenges facing the marginal likelihood calculation and transmodel MCMC as alternative strategies for estimating posterior probabilities for species trees. 
[Bayes factor; Bayesian inference; MCMC; multispecies coalescent; nodeslider; species tree; SPR.]
format Online
Article
Text
id pubmed-8562347
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8562347 2021-11-03 Efficient Bayesian Species Tree Inference under the Multispecies Coalescent Rannala, Bruce Yang, Ziheng Syst Biol Regular Articles Oxford University Press 2017-03-06 /pmc/articles/PMC8562347/ /pubmed/28053140 http://dx.doi.org/10.1093/sysbio/syw119 Text en © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by/4.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Efficient Bayesian Species Tree Inference under the Multispecies Coalescent
topic Regular Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8562347/
https://www.ncbi.nlm.nih.gov/pubmed/28053140
http://dx.doi.org/10.1093/sysbio/syw119