
Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics

Recent advances in statistical machine learning techniques have led to the creation of probabilistic programming frameworks. These frameworks enable probabilistic models to be rapidly prototyped and fit to data using scalable approximation methods such as variational inference. In this work, we expl...

Bibliographic Details
Main Authors: Fourment, Mathieu; Darling, Aaron E.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2019
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6966998/
https://www.ncbi.nlm.nih.gov/pubmed/31976168
http://dx.doi.org/10.7717/peerj.8272
_version_ 1783488860902129664
author Fourment, Mathieu
Darling, Aaron E.
author_facet Fourment, Mathieu
Darling, Aaron E.
author_sort Fourment, Mathieu
collection PubMed
description Recent advances in statistical machine learning techniques have led to the creation of probabilistic programming frameworks. These frameworks enable probabilistic models to be rapidly prototyped and fit to data using scalable approximation methods such as variational inference. In this work, we explore the use of the Stan language for probabilistic programming in application to phylogenetic models. We show that many commonly used phylogenetic models including the general time reversible substitution model, rate heterogeneity among sites, and a range of coalescent models can be implemented using a probabilistic programming language. The posterior probability distributions obtained via the black box variational inference engine in Stan were compared to those obtained with reference implementations of Markov chain Monte Carlo (MCMC) for phylogenetic inference. We find that black box variational inference in Stan is less accurate than MCMC methods for phylogenetic models, but requires far less compute time. Finally, we evaluate a custom implementation of mean-field variational inference on the Jukes–Cantor substitution model and show that a specialized implementation of variational inference can be two orders of magnitude faster and more accurate than a general purpose probabilistic implementation.
format Online
Article
Text
id pubmed-6966998
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-69669982020-01-23 Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics Fourment, Mathieu Darling, Aaron E. PeerJ Bioinformatics Recent advances in statistical machine learning techniques have led to the creation of probabilistic programming frameworks. These frameworks enable probabilistic models to be rapidly prototyped and fit to data using scalable approximation methods such as variational inference. In this work, we explore the use of the Stan language for probabilistic programming in application to phylogenetic models. We show that many commonly used phylogenetic models including the general time reversible substitution model, rate heterogeneity among sites, and a range of coalescent models can be implemented using a probabilistic programming language. The posterior probability distributions obtained via the black box variational inference engine in Stan were compared to those obtained with reference implementations of Markov chain Monte Carlo (MCMC) for phylogenetic inference. We find that black box variational inference in Stan is less accurate than MCMC methods for phylogenetic models, but requires far less compute time. Finally, we evaluate a custom implementation of mean-field variational inference on the Jukes–Cantor substitution model and show that a specialized implementation of variational inference can be two orders of magnitude faster and more accurate than a general purpose probabilistic implementation. PeerJ Inc. 2019-12-18 /pmc/articles/PMC6966998/ /pubmed/31976168 http://dx.doi.org/10.7717/peerj.8272 Text en © 2019 Fourment and Darling https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. 
For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Fourment, Mathieu
Darling, Aaron E.
Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics
title Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics
title_sort evaluating probabilistic programming and fast variational bayesian inference in phylogenetics
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6966998/
https://www.ncbi.nlm.nih.gov/pubmed/31976168
http://dx.doi.org/10.7717/peerj.8272