
Population Genetics Based Phylogenetics Under Stabilizing Selection for an Optimal Amino Acid Sequence: A Nested Modeling Approach


Bibliographic Details
Main authors: Beaulieu, Jeremy M; O’Meara, Brian C; Zaretzki, Russell; Landerer, Cedric; Chai, Juanjuan; Gilchrist, Michael A
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6445302/
https://www.ncbi.nlm.nih.gov/pubmed/30521036
http://dx.doi.org/10.1093/molbev/msy222
author Beaulieu, Jeremy M
O’Meara, Brian C
Zaretzki, Russell
Landerer, Cedric
Chai, Juanjuan
Gilchrist, Michael A
collection PubMed
description We present a new phylogenetic approach, selection on amino acids and codons (SelAC), whose substitution rates are based on a nested model linking protein expression to population genetics. Unlike simpler codon models that assume a single substitution matrix for all sites, our model more realistically represents the evolution of protein-coding DNA under the assumption of consistent, stabilizing selection using a cost–benefit approach. This cost–benefit approach allows us to generate a set of 20 optimal amino acid-specific matrix families using just a handful of parameters and naturally links the strength of stabilizing selection to protein synthesis levels, which we can estimate. Using a yeast data set of 100 orthologs for 6 taxa, we find SelAC fits the data much better than popular models by 10^4–10^5 Akaike information criterion units adjusted for small sample bias. Our results also indicated that nested, mechanistic models better predict observed data patterns, highlighting the improvement in biological realism in amino acid sequence evolution that our model provides. Additional parameters estimated by SelAC indicate that a large amount of nonphylogenetic, but biologically meaningful, information can be inferred from existing data. For example, SelAC prediction of gene-specific protein synthesis rates correlates well with both empirical (r=0.33–0.48) and other theoretical predictions (r=0.45–0.64) for multiple yeast species. SelAC also provides estimates of the optimal amino acid at each site. Finally, because SelAC is a nested approach based on clearly stated biological assumptions, future modifications, such as including shifts in the optimal amino acid sequence within or across lineages, are possible.
format Online
Article
Text
id pubmed-6445302
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6445302 2019-04-05 Mol Biol Evol Methods Oxford University Press 2019-04 2018-12-05 /pmc/articles/PMC6445302/ /pubmed/30521036 http://dx.doi.org/10.1093/molbev/msy222 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Population Genetics Based Phylogenetics Under Stabilizing Selection for an Optimal Amino Acid Sequence: A Nested Modeling Approach
topic Methods