Stochastic Variational Inference for Bayesian Phylogenetics: A Case of CAT Model

The pattern of molecular evolution varies among gene sites and genes in a genome. By taking into account the complex heterogeneity of evolutionary processes among sites in a genome, Bayesian infinite mixture models of genomic evolution enable robust phylogenetic inference. With large modern data set...

Bibliographic Details
Main Authors: Dang, Tung; Kishino, Hirohisa
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects: Methods
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6445300/
https://www.ncbi.nlm.nih.gov/pubmed/30715448
http://dx.doi.org/10.1093/molbev/msz020
author Dang, Tung
Kishino, Hirohisa
collection PubMed
description The pattern of molecular evolution varies among gene sites and genes in a genome. By taking into account the complex heterogeneity of evolutionary processes among sites in a genome, Bayesian infinite mixture models of genomic evolution enable robust phylogenetic inference. With large modern data sets, however, the computational burden of Markov chain Monte Carlo sampling techniques becomes prohibitive. Here, we have developed a variational Bayesian procedure to speed up the widely used PhyloBayes MPI program, which deals with the heterogeneity of amino acid profiles. Rather than sampling from the posterior distribution, the procedure approximates the (unknown) posterior distribution using a manageable distribution called the variational distribution. The parameters in the variational distribution are estimated by minimizing Kullback–Leibler divergence. To examine performance, we analyzed three empirical data sets consisting of mitochondrial, plastid-encoded, and nuclear proteins. Our variational method accurately approximated the Bayesian inference of phylogenetic tree, mixture proportions, and the amino acid propensity of each component of the mixture while using orders of magnitude less computational time.
format Online
Article
Text
id pubmed-6445300
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6445300 2019-04-05 Stochastic Variational Inference for Bayesian Phylogenetics: A Case of CAT Model Dang, Tung Kishino, Hirohisa Mol Biol Evol Methods Oxford University Press 2019-04 2019-02-01 /pmc/articles/PMC6445300/ /pubmed/30715448 http://dx.doi.org/10.1093/molbev/msz020 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Stochastic Variational Inference for Bayesian Phylogenetics: A Case of CAT Model
topic Methods