Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis
BACKGROUND: Nucleotide and amino acid substitution tendencies are characteristic of each species, organelle, and protein family. Hence, various empirical amino acid substitution rate matrices have needed to be estimated for phylogenetic analysis: JTT, WAG, and LG for nuclear proteins, mtREV for mito...
Main author: | Miyazawa, Sanzo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4225520/ https://www.ncbi.nlm.nih.gov/pubmed/24256155 http://dx.doi.org/10.1186/1471-2148-13-257 |
_version_ | 1782343525982011392 |
---|---|
author | Miyazawa, Sanzo |
author_facet | Miyazawa, Sanzo |
author_sort | Miyazawa, Sanzo |
collection | PubMed |
description | BACKGROUND: Nucleotide and amino acid substitution tendencies are characteristic of each species, organelle, and protein family. Hence, various empirical amino acid substitution rate matrices have needed to be estimated for phylogenetic analysis: JTT, WAG, and LG for nuclear proteins, mtREV for mitochondrial proteins, cpREV10 and cpREV64 for chloroplast-encoded proteins, and FLU for influenza proteins. On the other hand, in a mechanistic codon substitution model, in which each codon substitution rate is proportional to the product of a codon mutation rate and the ratio of fixation depending on the type of amino acid replacement, mutation rates and the strength of selective constraint on amino acids can be tailored to each protein family with additional 11 parameters. As a result, in the evolutionary analysis of codon sequences it outperforms codon substitution models equivalent to empirical amino acid substitution matrices. Is it superior even for amino acid sequences, among which synonymous substitutions cannot be identified? RESULTS: Nucleotide mutations are assumed to occur independently of codon positions but multiple nucleotide changes in infinitesimal time are allowed. Selective constraints on the respective types of amino acid replacements are tailored to each gene with a linear function of a given estimate of selective constraints, which were estimated by maximizing the likelihood of an empirical amino acid or codon substitution frequency matrix, each of JTT, WAG, LG, and KHG. It is shown that the mechanistic codon substitution model with the assumption of equal codon usage yields better values of Akaike and Bayesian information criteria for all three phylogenetic trees of mitochondrial, chloroplast, and influenza-A hemagglutinin proteins than the empirical amino acid substitution models with mtREV, cpREV64, and FLU, which were designed specifically for those protein families, respectively. The variation of selective constraint across sites fits the datasets significantly better than variable codon mutation rates, confirming that substitution rate variations across sites detected by amino acid substitution models are caused primarily by the variation of selective constraint against amino acid substitutions rather than the variation of codon mutation rate. CONCLUSIONS: The mechanistic codon substitution model is superior to amino acid substitution models even in the evolutionary analysis of protein sequences. |
format | Online Article Text |
id | pubmed-4225520 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4225520 2014-11-12 Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis Miyazawa, Sanzo BMC Evol Biol Research Article BACKGROUND: Nucleotide and amino acid substitution tendencies are characteristic of each species, organelle, and protein family. Hence, various empirical amino acid substitution rate matrices have needed to be estimated for phylogenetic analysis: JTT, WAG, and LG for nuclear proteins, mtREV for mitochondrial proteins, cpREV10 and cpREV64 for chloroplast-encoded proteins, and FLU for influenza proteins. On the other hand, in a mechanistic codon substitution model, in which each codon substitution rate is proportional to the product of a codon mutation rate and the ratio of fixation depending on the type of amino acid replacement, mutation rates and the strength of selective constraint on amino acids can be tailored to each protein family with additional 11 parameters. As a result, in the evolutionary analysis of codon sequences it outperforms codon substitution models equivalent to empirical amino acid substitution matrices. Is it superior even for amino acid sequences, among which synonymous substitutions cannot be identified? RESULTS: Nucleotide mutations are assumed to occur independently of codon positions but multiple nucleotide changes in infinitesimal time are allowed. Selective constraints on the respective types of amino acid replacements are tailored to each gene with a linear function of a given estimate of selective constraints, which were estimated by maximizing the likelihood of an empirical amino acid or codon substitution frequency matrix, each of JTT, WAG, LG, and KHG. It is shown that the mechanistic codon substitution model with the assumption of equal codon usage yields better values of Akaike and Bayesian information criteria for all three phylogenetic trees of mitochondrial, chloroplast, and influenza-A hemagglutinin proteins than the empirical amino acid substitution models with mtREV, cpREV64, and FLU, which were designed specifically for those protein families, respectively. The variation of selective constraint across sites fits the datasets significantly better than variable codon mutation rates, confirming that substitution rate variations across sites detected by amino acid substitution models are caused primarily by the variation of selective constraint against amino acid substitutions rather than the variation of codon mutation rate. CONCLUSIONS: The mechanistic codon substitution model is superior to amino acid substitution models even in the evolutionary analysis of protein sequences. BioMed Central 2013-11-21 /pmc/articles/PMC4225520/ /pubmed/24256155 http://dx.doi.org/10.1186/1471-2148-13-257 Text en Copyright © 2013 Miyazawa; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Miyazawa, Sanzo Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis |
title | Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis |
title_full | Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis |
title_fullStr | Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis |
title_full_unstemmed | Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis |
title_short | Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis |
title_sort | superiority of a mechanistic codon substitution model even for protein sequences in phylogenetic analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4225520/ https://www.ncbi.nlm.nih.gov/pubmed/24256155 http://dx.doi.org/10.1186/1471-2148-13-257 |
work_keys_str_mv | AT miyazawasanzo superiorityofamechanisticcodonsubstitutionmodelevenforproteinsequencesinphylogeneticanalysis |
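
Note on the model comparison in the description: the abstract ranks models by the Akaike and Bayesian information criteria, where lower values indicate a better trade-off between fit and parameter count. The sketch below only illustrates how such a comparison is typically computed from a maximized log-likelihood. The function names, log-likelihoods, parameter counts, and site count are hypothetical placeholders and are not taken from the article. Only the 11 additional parameters of the mechanistic model come from the abstract.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood, num_params, num_sites):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return num_params * math.log(num_sites) - 2 * log_likelihood


def best_model(models, num_sites, criterion="aic"):
    """Return the name of the model with the lowest AIC or BIC."""
    def score(entry):
        if criterion == "aic":
            return aic(entry["lnL"], entry["k"])
        return bic(entry["lnL"], entry["k"], num_sites)
    return min(models, key=lambda name: score(models[name]))


# Placeholder numbers for illustration only (NOT results from the article).
BASE_PARAMS = 40  # hypothetical count of branch lengths and other shared parameters
MODELS = {
    "empirical_matrix": {"lnL": -10250.0, "k": BASE_PARAMS},
    # The abstract states that the mechanistic model adds 11 parameters.
    "mechanistic_codon": {"lnL": -10100.0, "k": BASE_PARAMS + 11},
}
NUM_SITES = 3000  # hypothetical alignment length

if __name__ == "__main__":
    for name, entry in MODELS.items():
        print(name, aic(entry["lnL"], entry["k"]),
              round(bic(entry["lnL"], entry["k"], NUM_SITES), 1))
    print("best by AIC:", best_model(MODELS, NUM_SITES, "aic"))
    print("best by BIC:", best_model(MODELS, NUM_SITES, "bic"))
```

With these placeholder inputs, the model with 11 extra parameters is preferred only because its assumed gain in log-likelihood outweighs the penalty for the added parameters. BIC applies the heavier penalty whenever the number of sites exceeds about 7.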