Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics

Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model...

Bibliographic Details
Main authors: Kolaczkowski, Bryan; Thornton, Joseph W.
Format: Text
Language: English
Published: Public Library of Science, 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2785476/
https://www.ncbi.nlm.nih.gov/pubmed/20011052
http://dx.doi.org/10.1371/journal.pone.0007891
_version_ 1782174819025944576
author Kolaczkowski, Bryan
Thornton, Joseph W.
author_facet Kolaczkowski, Bryan
Thornton, Joseph W.
author_sort Kolaczkowski, Bryan
collection PubMed
description Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias—which is apparent under both controlled simulation conditions and in analyses of empirical sequence data—also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages—that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.
format Text
id pubmed-2785476
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2785476 2009-12-10 Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics Kolaczkowski, Bryan Thornton, Joseph W. PLoS One Research Article
Public Library of Science 2009-12-09 /pmc/articles/PMC2785476/ /pubmed/20011052 http://dx.doi.org/10.1371/journal.pone.0007891 Text en Kolaczkowski, Thornton. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Kolaczkowski, Bryan
Thornton, Joseph W.
Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
title Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
title_full Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
title_fullStr Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
title_full_unstemmed Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
title_short Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
title_sort long-branch attraction bias and inconsistency in bayesian phylogenetics
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2785476/
https://www.ncbi.nlm.nih.gov/pubmed/20011052
http://dx.doi.org/10.1371/journal.pone.0007891
work_keys_str_mv AT kolaczkowskibryan longbranchattractionbiasandinconsistencyinbayesianphylogenetics
AT thorntonjosephw longbranchattractionbiasandinconsistencyinbayesianphylogenetics