Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics
Recent advances in statistical machine learning techniques have led to the creation of probabilistic programming frameworks. These frameworks enable probabilistic models to be rapidly prototyped and fit to data using scalable approximation methods such as variational inference. In this work, we expl...
Main authors: | Fourment, Mathieu; Darling, Aaron E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2019 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6966998/ https://www.ncbi.nlm.nih.gov/pubmed/31976168 http://dx.doi.org/10.7717/peerj.8272 |
_version_ | 1783488860902129664 |
---|---|
author | Fourment, Mathieu Darling, Aaron E. |
collection | PubMed |
description | Recent advances in statistical machine learning techniques have led to the creation of probabilistic programming frameworks. These frameworks enable probabilistic models to be rapidly prototyped and fit to data using scalable approximation methods such as variational inference. In this work, we explore the use of the Stan language for probabilistic programming in application to phylogenetic models. We show that many commonly used phylogenetic models including the general time reversible substitution model, rate heterogeneity among sites, and a range of coalescent models can be implemented using a probabilistic programming language. The posterior probability distributions obtained via the black box variational inference engine in Stan were compared to those obtained with reference implementations of Markov chain Monte Carlo (MCMC) for phylogenetic inference. We find that black box variational inference in Stan is less accurate than MCMC methods for phylogenetic models, but requires far less compute time. Finally, we evaluate a custom implementation of mean-field variational inference on the Jukes–Cantor substitution model and show that a specialized implementation of variational inference can be two orders of magnitude faster and more accurate than a general purpose probabilistic implementation. |
format | Online Article Text |
id | pubmed-6966998 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6966998 2020-01-23 Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics Fourment, Mathieu Darling, Aaron E. PeerJ Bioinformatics PeerJ Inc. 2019-12-18 /pmc/articles/PMC6966998/ /pubmed/31976168 http://dx.doi.org/10.7717/peerj.8272 Text en © 2019 Fourment and Darling https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
title | Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics |
topic | Bioinformatics |