Benchmarking Multi-Rate Codon Models

The single rate codon model of non-synonymous substitution is ubiquitous in phylogenetic modeling. Indeed, the use of a non-synonymous to synonymous substitution rate ratio parameter has facilitated the interpretation of selection pressure on genomes. Although the single rate model has achieved wide...

Bibliographic Details
Main Authors: Delport, Wayne; Scheffler, Konrad; Gravenor, Mike B.; Muse, Spencer V.; Kosakovsky Pond, Sergei
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2908124/
https://www.ncbi.nlm.nih.gov/pubmed/20657773
http://dx.doi.org/10.1371/journal.pone.0011587
_version_ 1782184161332690944
author Delport, Wayne
Scheffler, Konrad
Gravenor, Mike B.
Muse, Spencer V.
Kosakovsky Pond, Sergei
author_sort Delport, Wayne
collection PubMed
description The single rate codon model of non-synonymous substitution is ubiquitous in phylogenetic modeling. Indeed, the use of a non-synonymous to synonymous substitution rate ratio parameter has facilitated the interpretation of selection pressure on genomes. Although the single rate model has achieved wide acceptance, we argue that the assumption of a single rate of non-synonymous substitution is biologically unreasonable, given observed differences in substitution rates evident from empirical amino acid models. Some have attempted to incorporate amino acid substitution biases into models of codon evolution and have shown improved model performance versus the single rate model. Here, we show that the single rate model of non-synonymous substitution is easily outperformed by a model with multiple non-synonymous rate classes, yet in which amino acid substitution pairs are assigned randomly to these classes. We argue that, since the single rate model is so easy to improve upon, new codon models should not be validated entirely on the basis of improved model fit over this model. Rather, we should strive to both improve on the single rate model and to approximate the general time-reversible model of codon substitution, with as few parameters as possible, so as to reduce model over-fitting. We hint at how this can be achieved with a Genetic Algorithm approach in which rate classes are assigned on the basis of sequence information content.
format Text
id pubmed-2908124
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2908124 2010-07-23 Benchmarking Multi-Rate Codon Models Delport, Wayne Scheffler, Konrad Gravenor, Mike B. Muse, Spencer V. Kosakovsky Pond, Sergei PLoS One Research Article Public Library of Science 2010-07-21 /pmc/articles/PMC2908124/ /pubmed/20657773 http://dx.doi.org/10.1371/journal.pone.0011587 Text en Delport et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Benchmarking Multi-Rate Codon Models
topic Research Article