
Complex Models of Sequence Evolution Require Accurate Estimators as Exemplified with the Invariable Site Plus Gamma Model

The invariable site plus Γ model (I+Γ) is widely used to model rate heterogeneity among alignment sites in maximum likelihood and Bayesian phylogenetic analyses. The proof that the I + continuous Γ model is identifiable (model pa...


Bibliographic Details
Main authors: Nguyen, Lam-Tung; von Haeseler, Arndt; Minh, Bui Quang
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6204645/
https://www.ncbi.nlm.nih.gov/pubmed/29186593
http://dx.doi.org/10.1093/sysbio/syx092
_version_ 1783366075119828992
author Nguyen, Lam-Tung
von Haeseler, Arndt
Minh, Bui Quang
author_facet Nguyen, Lam-Tung
von Haeseler, Arndt
Minh, Bui Quang
author_sort Nguyen, Lam-Tung
collection PubMed
description The invariable site plus Γ model (I+Γ) is widely used to model rate heterogeneity among alignment sites in maximum likelihood and Bayesian phylogenetic analyses. The proof that the I + continuous Γ model is identifiable (model parameters can be inferred correctly given enough data) has increased the creditability of its application to phylogeny reconstruction. However, most phylogenetic software implement the I + discrete Γ model, whose identifiability is likely but unproven. How well the parameters of the I + discrete Γ model are estimated is still disputed. Especially the correlation between the fraction of invariable sites and the fractions of sites with a slow evolutionary rate is discussed as being problematic. We show that optimization heuristics as implemented in frequently used phylogenetic software (PhyML, RAxML, IQ-TREE, and MrBayes) cannot always reliably estimate the shape parameter, the proportion of invariable sites, and the tree length. Here, we propose an improved optimization heuristic that accurately estimates the three parameters. While research efforts mainly focus on tree search methods, our results signify the equal importance of verifying and developing effective estimation methods for complex models of sequence evolution.
format Online
Article
Text
id pubmed-6204645
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6204645 2018-10-31 Complex Models of Sequence Evolution Require Accurate Estimators as Exemplified with the Invariable Site Plus Gamma Model Nguyen, Lam-Tung von Haeseler, Arndt Minh, Bui Quang Syst Biol Points of View The invariable site plus Γ model (I+Γ) is widely used to model rate heterogeneity among alignment sites in maximum likelihood and Bayesian phylogenetic analyses. The proof that the I + continuous Γ model is identifiable (model parameters can be inferred correctly given enough data) has increased the creditability of its application to phylogeny reconstruction. However, most phylogenetic software implement the I + discrete Γ model, whose identifiability is likely but unproven. How well the parameters of the I + discrete Γ model are estimated is still disputed. Especially the correlation between the fraction of invariable sites and the fractions of sites with a slow evolutionary rate is discussed as being problematic. We show that optimization heuristics as implemented in frequently used phylogenetic software (PhyML, RAxML, IQ-TREE, and MrBayes) cannot always reliably estimate the shape parameter, the proportion of invariable sites, and the tree length. Here, we propose an improved optimization heuristic that accurately estimates the three parameters. While research efforts mainly focus on tree search methods, our results signify the equal importance of verifying and developing effective estimation methods for complex models of sequence evolution. Oxford University Press 2018-05 2017-11-27 /pmc/articles/PMC6204645/ /pubmed/29186593 http://dx.doi.org/10.1093/sysbio/syx092 Text en © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For Permissions, please email: journals.permissions@oup.com
spellingShingle Points of View
Nguyen, Lam-Tung
von Haeseler, Arndt
Minh, Bui Quang
Complex Models of Sequence Evolution Require Accurate Estimators as Exemplified with the Invariable Site Plus Gamma Model
title Complex Models of Sequence Evolution Require Accurate Estimators as Exemplified with the Invariable Site Plus Gamma Model
topic Points of View
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6204645/
https://www.ncbi.nlm.nih.gov/pubmed/29186593
http://dx.doi.org/10.1093/sysbio/syx092
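
The I + discrete Γ model named in the description can be illustrated with a minimal Python sketch. It is not the authors' estimator and not the code of any of the programs listed. It only builds the rate categories of the mixture for a given shape parameter and proportion of invariable sites. The function names, the sampling-based approximation of the category means, and the rescaling that keeps the overall mean rate at 1 are all assumptions made for illustration.

```python
import random


def discrete_gamma_rates(alpha, k=4, n_samples=200_000, seed=1):
    """Approximate the mean rates of k equal-probability Gamma categories.

    Hypothetical helper: draws from a Gamma distribution with shape alpha
    and scale 1/alpha (mean 1), sorts the draws, and averages each of the
    k equal-sized bins. Draws left over when n_samples is not divisible
    by k are ignored.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    draws = sorted(rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_samples))
    size = n_samples // k
    means = [sum(draws[i * size:(i + 1) * size]) / size for i in range(k)]
    # Renormalise so that the k category rates average exactly 1.
    avg = sum(means) / k
    return [m / avg for m in means]


def invariable_plus_gamma(alpha, p_inv, k=4):
    """Return (weights, rates) for an I + discrete Gamma mixture.

    Assumed convention: invariable sites have rate 0 and weight p_inv;
    each Gamma category has weight (1 - p_inv) / k, and its rate is
    divided by (1 - p_inv) so the mean rate over all sites stays 1.
    """
    if not 0.0 <= p_inv < 1.0:
        raise ValueError("p_inv must be in [0, 1)")
    gamma_rates = discrete_gamma_rates(alpha, k)
    weights = [p_inv] + [(1.0 - p_inv) / k] * k
    rates = [0.0] + [r / (1.0 - p_inv) for r in gamma_rates]
    return weights, rates


if __name__ == "__main__":
    weights, rates = invariable_plus_gamma(alpha=0.5, p_inv=0.2)
    for w, r in zip(weights, rates):
        print(f"weight={w:.3f}  rate={r:.4f}")
```

With a small shape parameter, the slowest Γ category has a rate close to zero, so it behaves much like the invariable class. This is the correlation between invariable sites and slowly evolving sites that the description calls problematic for parameter estimation.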