
Estimating Bayesian Phylogenetic Information Content

Measuring the phylogenetic information content of data has a long history in systematics. Here we explore a Bayesian approach to information content estimation. The entropy of the posterior distribution compared with the entropy of the prior distribution provides a natural way to measure information...


Bibliographic Details
Main authors: Lewis, Paul O., Chen, Ming-Hui, Kuo, Lynn, Lewis, Louise A., Fučíková, Karolina, Neupane, Suman, Wang, Yu-Bo, Shi, Daoyuan
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5066063/
https://www.ncbi.nlm.nih.gov/pubmed/27155008
http://dx.doi.org/10.1093/sysbio/syw042
author Lewis, Paul O.
Chen, Ming-Hui
Kuo, Lynn
Lewis, Louise A.
Fučíková, Karolina
Neupane, Suman
Wang, Yu-Bo
Shi, Daoyuan
collection PubMed
description Measuring the phylogenetic information content of data has a long history in systematics. Here we explore a Bayesian approach to information content estimation. The entropy of the posterior distribution compared with the entropy of the prior distribution provides a natural way to measure information content. If the data have no information relevant to ranking tree topologies beyond the information supplied by the prior, the posterior and prior will be identical. Information in data discourages consideration of some hypotheses allowed by the prior, resulting in a posterior distribution that is more concentrated (has lower entropy) than the prior. We focus on measuring information about tree topology using marginal posterior distributions of tree topologies. We show that both the accuracy and the computational efficiency of topological information content estimation improve with use of the conditional clade distribution, which also allows topological information content to be partitioned by clade. We explore two important applications of our method: providing a compelling definition of saturation and detecting conflict among data partitions that can negatively affect analyses of concatenated data. [Bayesian; concatenation; conditional clade distribution; entropy; information; phylogenetics; saturation.]
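As a rough illustration of the entropy comparison described above, the following sketch estimates topological information content as prior entropy minus posterior entropy. It is not the authors' implementation. It assumes a uniform prior over unrooted binary tree topologies, and it estimates the posterior from raw sample frequencies rather than from the conditional clade distribution that the description recommends. The function names and the use of topology strings as sample labels are hypothetical.

```python
import math
from collections import Counter


def num_unrooted_topologies(n_taxa):
    """Count unrooted binary tree topologies for n_taxa >= 3, i.e. (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= k
    return count


def shannon_entropy(probs):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def topology_information(sampled_topologies, n_taxa):
    """Return (information, fraction of maximum information).

    sampled_topologies: hashable topology labels, e.g. one canonical
    string per posterior sample (assumed input format).
    Assumes a uniform prior over unrooted binary topologies.
    """
    counts = Counter(sampled_topologies)
    total = sum(counts.values())
    posterior = [c / total for c in counts.values()]

    h_prior = math.log(num_unrooted_topologies(n_taxa))
    h_post = shannon_entropy(posterior)

    info = h_prior - h_post
    fraction = info / h_prior if h_prior > 0 else 0.0
    return info, fraction
```

If every sample has the same topology, the estimated posterior entropy is zero and the fraction is 1. If the samples are spread evenly over all possible topologies, the estimate is close to zero, matching the case in which the posterior and prior are identical. Sample-frequency estimates of this kind become unreliable when the number of distinct topologies is large relative to the sample size.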
format Online
Article
Text
id pubmed-5066063
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5066063 2016-10-18 Syst Biol Regular Articles Oxford University Press 2016-11 2016-05-06 /pmc/articles/PMC5066063/ /pubmed/27155008 http://dx.doi.org/10.1093/sysbio/syw042 Text en © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Estimating Bayesian Phylogenetic Information Content
topic Regular Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5066063/
https://www.ncbi.nlm.nih.gov/pubmed/27155008
http://dx.doi.org/10.1093/sysbio/syw042