
OBAMA: OBAMA for Bayesian amino-acid model averaging

BACKGROUND: Bayesian analyses offer many benefits for phylogenetics and have been popular for the analysis of amino acid alignments. It is necessary to specify a substitution and site model for such analyses, and often an ad hoc or likelihood-based method is employed for choosing these models that are...


Bibliographic Details
Main author: Bouckaert, Remco R.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7413081/
https://www.ncbi.nlm.nih.gov/pubmed/32832259
http://dx.doi.org/10.7717/peerj.9460
author Bouckaert, Remco R.
collection PubMed
description BACKGROUND: Bayesian analyses offer many benefits for phylogenetics and have been popular for the analysis of amino acid alignments. It is necessary to specify a substitution and site model for such analyses, and often an ad hoc or likelihood-based method is employed to choose these models, even though they are typically of no interest to the analysis overall. METHODS: We present a method called OBAMA that averages over substitution models and site models, thus letting the data inform model choices and taking model uncertainty into account. It uses trans-dimensional Markov chain Monte Carlo (MCMC) proposals to switch between various empirical substitution models for amino acids such as Dayhoff, WAG, and JTT. Furthermore, it switches between the base frequencies from these substitution models and base frequencies estimated from the alignment. Finally, it switches between using gamma rate heterogeneity or not, and between using a proportion of invariable sites or not. RESULTS: We show that the model performs well in a simulation study. By using appropriate priors, we demonstrate that both the proportion of invariable sites and the shape parameter for gamma rate heterogeneity can be estimated. The OBAMA method allows model uncertainty to be taken into account, thus reducing bias in phylogenetic estimates. The method is implemented in the OBAMA package in BEAST 2, which is open source, licensed under the LGPL, and allows joint tree inference under a wide range of models.
format Online
Article
Text
id pubmed-7413081
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7413081 2020-08-21 OBAMA: OBAMA for Bayesian amino-acid model averaging Bouckaert, Remco R. PeerJ Bioinformatics PeerJ Inc. 2020-08-04 /pmc/articles/PMC7413081/ /pubmed/32832259 http://dx.doi.org/10.7717/peerj.9460 Text en ©2020 Bouckaert https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title OBAMA: OBAMA for Bayesian amino-acid model averaging
topic Bioinformatics