
Starless bias and parameter-estimation bias in the likelihood-based phylogenetic method

I analyzed various site pattern combinations in a 4-OTU case to identify sources of starless bias and parameter-estimation bias in likelihood-based phylogenetic methods, and reported three significant contributions. First, the likelihood method is counterintuitive in that it may not generate a star...


Bibliographic Details
Main author: Xia, Xuhua
Format: Online Article Text
Language: English
Published: AIMS Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6690233/
https://www.ncbi.nlm.nih.gov/pubmed/31435522
http://dx.doi.org/10.3934/genet.2018.4.212
author Xia, Xuhua
collection PubMed
description I analyzed various site pattern combinations in a 4-OTU case to identify sources of starless bias and parameter-estimation bias in likelihood-based phylogenetic methods, and reported three significant contributions. First, the likelihood method is counterintuitive in that it may not generate a star tree with sequences that are equidistant from each other. This behaviour, dubbed starless bias, happens in a 4-OTU tree when there is an excess (i.e., more than expected from a star tree and a substitution model) of conflicting phylogenetic signals supporting the three resolved topologies equally. Special site pattern combinations leading to rejection of a star tree, when sequences are equidistant from each other, were identified. Second, fitting a gamma distribution to model rate heterogeneity over sites is strongly confounded with tree topology, especially in conjunction with the starless bias. I present examples to show dramatic differences in the estimated shape parameter α between a star tree and a resolved tree. There may be no rate heterogeneity over sites (with the estimated α > 10000) when a star tree is imposed, but α < 1 (suggesting strong rate heterogeneity over sites) when an (incorrect) resolved tree is imposed. Thus, the dependence of “rate heterogeneity” on tree topology implies that “rate heterogeneity” is not a sequence-specific feature, cautioning against interpreting a small α to mean that some sites are under strong purifying selection while others are not. Third, because there is no existing (and working) likelihood method for evaluating a star tree with continuous gamma-distributed rates, I have implemented the method for JC69 in a self-contained R script for a four-OTU tree (star or resolved), in addition to another R script assuming a constant rate over sites. These R scripts should be useful for teaching and exploring likelihood methods in phylogenetics.
format Online
Article
Text
id pubmed-6690233
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher AIMS Press
record_format MEDLINE/PubMed
spelling pubmed-6690233 2019-08-21 Starless bias and parameter-estimation bias in the likelihood-based phylogenetic method Xia, Xuhua AIMS Genet Research Article AIMS Press 2019-04-09 /pmc/articles/PMC6690233/ /pubmed/31435522 http://dx.doi.org/10.3934/genet.2018.4.212 Text en © 2018 the Author(s), licensee AIMS Press This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
title Starless bias and parameter-estimation bias in the likelihood-based phylogenetic method
topic Research Article
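
The abstract describes evaluating a four-OTU star or resolved tree under JC69, with either a constant rate over sites or a continuous gamma-distributed rate. The following is a minimal Python sketch of that kind of calculation, not the author's R scripts. All function names are hypothetical, the resolved topology is fixed to ((1,2),(3,4)), and the gamma version assumes a rate with mean 1 and shape α, using the closed form E[exp(-c r)] = (1 + c/α)^(-α), which may differ from the method used in the article. The site-pattern counts in the usage part are invented for illustration only.

```python
import itertools
import math

STATES = "ACGT"

def jc69_prob(same, b):
    """JC69 transition probability along a branch of length b (substitutions per site)."""
    e = math.exp(-4.0 * b / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def star_site_lik(pattern, branches):
    """Site likelihood on a 4-OTU star tree, constant rate over sites."""
    total = 0.0
    for x in STATES:          # sum over the unknown state at the central node
        term = 0.25           # JC69 equilibrium frequency
        for s, b in zip(pattern, branches):
            term *= jc69_prob(x == s, b)
        total += term
    return total

def resolved_site_lik(pattern, branches, internal):
    """Site likelihood on ((1,2),(3,4)) with internal branch length `internal`."""
    s1, s2, s3, s4 = pattern
    b1, b2, b3, b4 = branches
    total = 0.0
    for x in STATES:          # state at the node joining OTUs 1 and 2
        left = 0.25 * jc69_prob(x == s1, b1) * jc69_prob(x == s2, b2)
        for y in STATES:      # state at the node joining OTUs 3 and 4
            right = jc69_prob(y == s3, b3) * jc69_prob(y == s4, b4)
            total += left * jc69_prob(x == y, internal) * right
    return total

def star_site_lik_gamma(pattern, branches, alpha):
    """Star-tree site likelihood averaged over a continuous gamma rate (mean 1, shape alpha).

    Assumption of this sketch: each tip factor 1/4 + a*exp(-4*b*r/3) is expanded,
    and every resulting exp(-c*r) term is replaced by (1 + c/alpha)**(-alpha).
    """
    total = 0.0
    for x in STATES:
        for subset in itertools.product((False, True), repeat=4):
            coef, c = 0.25, 0.0
            for s, b, picked in zip(pattern, branches, subset):
                if picked:    # take the exponential part of this tip's factor
                    coef *= 0.75 if x == s else -0.25
                    c += 4.0 * b / 3.0
                else:         # take the constant 1/4 part
                    coef *= 0.25
            total += coef * math.exp(-alpha * math.log1p(c / alpha))
    return total

def log_lik(counts, site_lik):
    """Log-likelihood of site-pattern counts given a per-pattern likelihood function."""
    return sum(n * math.log(site_lik(p)) for p, n in counts.items())

if __name__ == "__main__":
    # Invented counts: equal support for the three resolved topologies.
    counts = {"AAAA": 70, "AACC": 10, "ACAC": 10, "ACCA": 10}
    br = (0.1, 0.1, 0.1, 0.1)
    print("star, constant rate:", log_lik(counts, lambda p: star_site_lik(p, br)))
    print("star, gamma alpha=0.5:", log_lik(counts, lambda p: star_site_lik_gamma(p, br, 0.5)))
    print("resolved, constant rate:", log_lik(counts, lambda p: resolved_site_lik(p, br, 0.05)))
```

In this sketch the branch lengths and α are supplied by hand; comparing a star tree with a resolved tree as in the abstract would additionally require maximizing each log-likelihood over its parameters, which is left out here. Setting the internal branch to zero makes the resolved-tree likelihood coincide with the star-tree likelihood, and a very large α makes the gamma version approach the constant-rate one.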