Relative Efficiencies of Simple and Complex Substitution Models in Estimating Divergence Times in Phylogenomics

The conventional wisdom in molecular evolution is to apply parameter-rich models of nucleotide and amino acid substitutions for estimating divergence times. However, the actual extent of the difference between time estimates produced by highly complex models compared with those from simple models is...

Bibliographic Details
Main Authors: Tao, Qiqing; Barba-Montoya, Jose; Huuki, Louise A; Durnan, Mary Kathleen; Kumar, Sudhir
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Methods
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7253201/
https://www.ncbi.nlm.nih.gov/pubmed/32119075
http://dx.doi.org/10.1093/molbev/msaa049
author Tao, Qiqing
Barba-Montoya, Jose
Huuki, Louise A
Durnan, Mary Kathleen
Kumar, Sudhir
collection PubMed
description The conventional wisdom in molecular evolution is to apply parameter-rich models of nucleotide and amino acid substitutions for estimating divergence times. However, the actual extent of the difference between time estimates produced by highly complex models compared with those from simple models is yet to be quantified for contemporary data sets that frequently contain sequences from many species and genes. In a reanalysis of many large multispecies alignments from diverse groups of taxa, we found that the use of the simplest models can produce divergence time estimates and credibility intervals similar to those obtained from the complex models applied in the original studies. This result is surprising because the use of simple models underestimates sequence divergence for all the data sets analyzed. We found three fundamental reasons for the observed robustness of time estimates to model complexity in many practical data sets. First, the estimates of branch lengths and node-to-tip distances under the simplest model show an approximately linear relationship with those produced by using the most complex models applied on data sets with many sequences. Second, relaxed clock methods automatically adjust rates on branches that experience considerable underestimation of sequence divergences, resulting in time estimates that are similar to those from complex models. And, third, the inclusion of even a few good calibrations in an analysis can reduce the difference in time estimates from simple and complex models. The robustness of time estimates to model complexity in these empirical data analyses is encouraging, because all phylogenomics studies use statistical models that are oversimplified descriptions of actual evolutionary substitution processes.
format Online
Article
Text
id pubmed-7253201
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7253201 2020-06-02 Relative Efficiencies of Simple and Complex Substitution Models in Estimating Divergence Times in Phylogenomics Tao, Qiqing; Barba-Montoya, Jose; Huuki, Louise A; Durnan, Mary Kathleen; Kumar, Sudhir Mol Biol Evol Methods Oxford University Press 2020-06 2020-03-02 /pmc/articles/PMC7253201/ /pubmed/32119075 http://dx.doi.org/10.1093/molbev/msaa049 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Relative Efficiencies of Simple and Complex Substitution Models in Estimating Divergence Times in Phylogenomics
topic Methods