EmpPrior: using outside empirical data to inform branch-length priors for Bayesian phylogenetics

BACKGROUND: Branch-length parameters are a central component of phylogenetic models and of intrinsic biological interest. Default branch-length priors in some Bayesian phylogenetic software can be unintentionally informative and lead to branch- and tree-length estimates that are unreasonable. Altern...

Bibliographic Details
Main authors: Andersen, John J., Nelson, Bradley J., Brown, Jeremy M.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4919878/
https://www.ncbi.nlm.nih.gov/pubmed/27342194
http://dx.doi.org/10.1186/s12859-016-1132-4
author Andersen, John J.
Nelson, Bradley J.
Brown, Jeremy M.
collection PubMed
description BACKGROUND: Branch-length parameters are a central component of phylogenetic models and of intrinsic biological interest. Default branch-length priors in some Bayesian phylogenetic software can be unintentionally informative and lead to branch- and tree-length estimates that are unreasonable. Alternatively, priors may be uninformative, but lead to diffuse posterior estimates. Despite the widespread availability of relevant datasets from other groups, biologists rarely leverage outside information to specify branch-length priors that are specific to the analysis they are conducting. RESULTS: We developed the software package EmpPrior to facilitate the collection and incorporation of relevant, outside information when setting branch-length priors for phylogenetics. EmpPrior efficiently queries TreeBASE to find data that are similar to focal data, in terms of taxonomic and genetic sampling, and uses them to inform branch-length priors for the focal analysis. EmpPrior consists of two components: EmpPrior-search, written in Java to query TreeBASE, and EmpPrior-fit, written in R to parameterize branch-length distributions. In an example analysis, we show how the use of relevant, outside data is made possible by EmpPrior and improves tree-length estimates from a focal dataset. CONCLUSION: EmpPrior is easy to use, fast, and improves both the accuracy and precision of branch-length estimates in many circumstances. While EmpPrior’s focus is on branch lengths, the strategy it employs could easily be extended to address other prior parameterization problems in phylogenetics.
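The description says EmpPrior-fit parameterizes branch-length distributions from outside data, but it does not say which distributions are used. As a minimal sketch of the general idea, and assuming an exponential branch-length prior (an assumption made here for illustration, not something stated in this record), the maximum-likelihood rate is the reciprocal of the mean outside branch length. The function name and the branch-length values below are hypothetical.

```python
from statistics import fmean


def fit_exponential_rate(branch_lengths):
    """Return the maximum-likelihood rate of an exponential distribution.

    For an exponential distribution the ML estimate of the rate is
    1 / (sample mean). The exponential form is an assumption of this sketch.
    """
    lengths = [float(b) for b in branch_lengths]
    if not lengths or any(b <= 0 for b in lengths):
        raise ValueError("need at least one strictly positive branch length")
    return 1.0 / fmean(lengths)


# Hypothetical branch lengths (substitutions per site) pooled from
# outside trees judged similar to the focal dataset.
outside_branch_lengths = [0.02, 0.05, 0.01, 0.08, 0.04]

rate = fit_exponential_rate(outside_branch_lengths)
prior_mean = 1.0 / rate  # mean branch length implied by the fitted prior
print(f"exponential rate = {rate:.3f}, prior mean = {prior_mean:.4f}")
```

The fitted rate could then be supplied as the rate of an exponential branch-length prior in the focal Bayesian analysis, in place of a software default.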
format Online
Article
Text
id pubmed-4919878
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4919878 2016-06-28 EmpPrior: using outside empirical data to inform branch-length priors for Bayesian phylogenetics Andersen, John J. Nelson, Bradley J. Brown, Jeremy M. BMC Bioinformatics Software BioMed Central 2016-06-24 /pmc/articles/PMC4919878/ /pubmed/27342194 http://dx.doi.org/10.1186/s12859-016-1132-4 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title EmpPrior: using outside empirical data to inform branch-length priors for Bayesian phylogenetics
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4919878/
https://www.ncbi.nlm.nih.gov/pubmed/27342194
http://dx.doi.org/10.1186/s12859-016-1132-4