
CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences


Bibliographic Details
Main authors: Delport, Wayne; Scheffler, Konrad; Botha, Gordon; Gravenor, Mike B.; Muse, Spencer V.; Kosakovsky Pond, Sergei L.
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2924240/
https://www.ncbi.nlm.nih.gov/pubmed/20808876
http://dx.doi.org/10.1371/journal.pcbi.1000885
author Delport, Wayne
Scheffler, Konrad
Botha, Gordon
Gravenor, Mike B.
Muse, Spencer V.
Kosakovsky Pond, Sergei L.
collection PubMed
description Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have independent rates of substitution offer improved fit over single rate models. However, these approaches have been limited by the necessity for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into [Image: see text] rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of [Image: see text] rate classes, where [Image: see text] is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene and organism specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes.
format Text
id pubmed-2924240
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2924240 2010-08-31 CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences Delport, Wayne Scheffler, Konrad Botha, Gordon Gravenor, Mike B. Muse, Spencer V. Kosakovsky Pond, Sergei L. PLoS Comput Biol Research Article Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have independent rates of substitution offer improved fit over single rate models. However, these approaches have been limited by the necessity for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into [Image: see text] rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of [Image: see text] rate classes, where [Image: see text] is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene and organism specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes. Public Library of Science 2010-08-19 /pmc/articles/PMC2924240/ /pubmed/20808876 http://dx.doi.org/10.1371/journal.pcbi.1000885 Text en Delport et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2924240/
https://www.ncbi.nlm.nih.gov/pubmed/20808876
http://dx.doi.org/10.1371/journal.pcbi.1000885