Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics

In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC)...

Bibliographic Details
Main Authors: Höhna, Sebastian, Landis, Michael J., Huelsenbeck, John P.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8570164/
https://www.ncbi.nlm.nih.gov/pubmed/34760401
http://dx.doi.org/10.7717/peerj.12438
_version_ 1784594786591703040
author Höhna, Sebastian
Landis, Michael J.
Huelsenbeck, John P.
author_facet Höhna, Sebastian
Landis, Michael J.
Huelsenbeck, John P.
author_sort Höhna, Sebastian
collection PubMed
description In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over available CPUs. Our parallelization strategy can easily be applied to any statistical model despite our primary focus on molecular substitution models in this study. Using two phylogenetic example datasets, we demonstrate that the runtime of the marginal likelihood estimation can be reduced significantly even if only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear with the number of available CPUs. We record a performance increase of 13.3x for cluster nodes with 16 CPUs, representing a substantial reduction to the runtime of marginal likelihood estimations. Hence, our parallelization strategy enables the estimation of marginal likelihoods to complete in a feasible amount of time which previously needed days, weeks or even months. The methods described here are implemented in our open-source software RevBayes which is available from http://www.RevBayes.com.
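The abstract describes distributing a series of power posterior simulations over CPUs and combining them into a marginal likelihood estimate. The sketch below illustrates that idea with the stepping-stone estimator on a toy model; it is not the RevBayes implementation. Everything in it is an assumption made for illustration: the conjugate normal model and its data, the helper names, the Beta(0.3, 1)-quantile spacing of the powers, and the use of direct draws in place of MCMC. A thread pool stands in for the CPUs; in Python it shows only how the independent runs are scheduled, not a real speedup.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

# Toy conjugate model (illustrative assumption, not from the paper):
# y_i ~ Normal(mu, SIGMA^2) with known SIGMA, and prior mu ~ Normal(M0, S0^2).
SIGMA, M0, S0 = 1.0, 0.0, 1.0
DATA = [0.3, 1.1, -0.4, 0.8, 0.5, 1.4, 0.1, 0.9, -0.2, 0.6]


def log_likelihood(mu):
    n = len(DATA)
    sq = sum((y - mu) ** 2 for y in DATA)
    return -0.5 * n * math.log(2 * math.pi * SIGMA**2) - sq / (2 * SIGMA**2)


def sample_power_posterior(beta, n_samples, seed):
    """Draw mu from prior(mu) * likelihood(mu)^beta and return the
    log-likelihood of each draw. Conjugacy makes this power posterior a
    normal distribution, so direct draws stand in for an MCMC run."""
    rng = random.Random(seed)
    precision = 1 / S0**2 + beta * len(DATA) / SIGMA**2
    mean = (M0 / S0**2 + beta * sum(DATA) / SIGMA**2) / precision
    sd = math.sqrt(1 / precision)
    return [log_likelihood(rng.gauss(mean, sd)) for _ in range(n_samples)]


def log_mean_exp(values):
    # Numerically stable log of the average of exp(values).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def stepping_stone(n_stones=32, n_samples=2000, n_workers=4, seed=1):
    # Powers from 0 (prior) to 1 (posterior), crowded near 0.
    betas = [(k / n_stones) ** (1 / 0.3) for k in range(n_stones + 1)]

    def run(k):
        # Each run depends only on its own power and seed, so the runs
        # can be handed to workers in any order.
        return sample_power_posterior(betas[k], n_samples, seed + k)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        runs = list(pool.map(run, range(n_stones)))

    # Stone k: average of likelihood^(beta_k - beta_{k-1}) over draws
    # taken at the lower power beta_{k-1}.
    log_z = 0.0
    for k in range(1, n_stones + 1):
        step = betas[k] - betas[k - 1]
        log_z += log_mean_exp([step * ll for ll in runs[k - 1]])
    return log_z


def exact_log_marginal():
    # Closed form: y ~ MVN(M0 * 1, SIGMA^2 * I + S0^2 * 1 1^T).
    n = len(DATA)
    resid = [y - M0 for y in DATA]
    total_var = SIGMA**2 + n * S0**2
    log_det = (n - 1) * math.log(SIGMA**2) + math.log(total_var)
    quad = (sum(r * r for r in resid)
            - S0**2 / total_var * sum(resid) ** 2) / SIGMA**2
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


if __name__ == "__main__":
    print("stepping-stone estimate:", stepping_stone())
    print("exact log marginal:     ", exact_log_marginal())
```

Because each power posterior run uses its own seed, the estimate does not depend on the number of workers; only the wall-clock time would change in a truly parallel setting. For scale, the speedups reported in the abstract correspond to parallel efficiencies of 1.96/2 = 0.98 on two CPUs and 13.3/16 ≈ 0.83 on 16 CPUs.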
format Online
Article
Text
id pubmed-8570164
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8570164 2021-11-09 Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics Höhna, Sebastian Landis, Michael J. Huelsenbeck, John P. PeerJ Bioinformatics In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over available CPUs. Our parallelization strategy can easily be applied to any statistical model despite our primary focus on molecular substitution models in this study. Using two phylogenetic example datasets, we demonstrate that the runtime of the marginal likelihood estimation can be reduced significantly even if only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear with the number of available CPUs. We record a performance increase of 13.3x for cluster nodes with 16 CPUs, representing a substantial reduction to the runtime of marginal likelihood estimations. Hence, our parallelization strategy enables the estimation of marginal likelihoods to complete in a feasible amount of time which previously needed days, weeks or even months. The methods described here are implemented in our open-source software RevBayes which is available from http://www.RevBayes.com. PeerJ Inc. 2021-11-02 /pmc/articles/PMC8570164/ /pubmed/34760401 http://dx.doi.org/10.7717/peerj.12438 Text en ©2021 Höhna et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Höhna, Sebastian
Landis, Michael J.
Huelsenbeck, John P.
Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
title Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
title_full Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
title_fullStr Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
title_full_unstemmed Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
title_short Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
title_sort parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8570164/
https://www.ncbi.nlm.nih.gov/pubmed/34760401
http://dx.doi.org/10.7717/peerj.12438
work_keys_str_mv AT hohnasebastian parallelpowerposterioranalysesforfastcomputationofmarginallikelihoodsinphylogenetics
AT landismichaelj parallelpowerposterioranalysesforfastcomputationofmarginallikelihoodsinphylogenetics
AT huelsenbeckjohnp parallelpowerposterioranalysesforfastcomputationofmarginallikelihoodsinphylogenetics