Assessing optimal: inequalities in codon optimization algorithms
BACKGROUND: Custom genes have become a common resource in recombinant biology over the last 20 years due to the plummeting cost of DNA synthesis. These genes are often “optimized” to non-native sequences for overexpression in a non-native host by substituting synonymous codons within the coding DNA...
Main authors: | Ranaghan, Matthew J., Li, Jeffrey J., Laprise, Dylan M., Garvie, Colin W. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7893858/ https://www.ncbi.nlm.nih.gov/pubmed/33607980 http://dx.doi.org/10.1186/s12915-021-00968-8 |
_version_ | 1783653131861622784 |
---|---|
author | Ranaghan, Matthew J. Li, Jeffrey J. Laprise, Dylan M. Garvie, Colin W. |
author_facet | Ranaghan, Matthew J. Li, Jeffrey J. Laprise, Dylan M. Garvie, Colin W. |
author_sort | Ranaghan, Matthew J. |
collection | PubMed |
description | BACKGROUND: Custom genes have become a common resource in recombinant biology over the last 20 years due to the plummeting cost of DNA synthesis. These genes are often “optimized” to non-native sequences for overexpression in a non-native host by substituting synonymous codons within the coding DNA sequence (CDS). A handful of studies have compared native and optimized CDSs, reporting different levels of soluble product due to the accumulation of misfolded aggregates, variable activity of enzymes, and (at least one report of) a change in substrate specificity. No study, to the best of our knowledge, has performed a practical comparison of CDSs generated from different codon optimization algorithms or reported the corresponding protein yields. RESULTS: In our efforts to understand what factors constitute an optimized CDS, we identified that there is little consensus among codon-optimization algorithms, a roughly equivalent chance that an algorithm-optimized CDS will increase or diminish recombinant yields as compared to the native DNA, a near ubiquitous use of a codon database that was last updated in 2007, and a high variability of output CDSs by some algorithms. We present a case study, using KRas4B, to demonstrate that a median codon frequency may be a better predictor of soluble yields than the more commonly utilized CAI metric. CONCLUSIONS: We present a method for visualizing, analyzing, and comparing algorithm-optimized DNA sequences for recombinant protein expression. We encourage researchers to consider if DNA optimization is right for their experiments, and work towards improving the reproducibility of published recombinant work by publishing non-native CDSs. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12915-021-00968-8. |
format | Online Article Text |
id | pubmed-7893858 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-78938582021-02-22 Assessing optimal: inequalities in codon optimization algorithms Ranaghan, Matthew J. Li, Jeffrey J. Laprise, Dylan M. Garvie, Colin W. BMC Biol Methodology Article BACKGROUND: Custom genes have become a common resource in recombinant biology over the last 20 years due to the plummeting cost of DNA synthesis. These genes are often “optimized” to non-native sequences for overexpression in a non-native host by substituting synonymous codons within the coding DNA sequence (CDS). A handful of studies have compared native and optimized CDSs, reporting different levels of soluble product due to the accumulation of misfolded aggregates, variable activity of enzymes, and (at least one report of) a change in substrate specificity. No study, to the best of our knowledge, has performed a practical comparison of CDSs generated from different codon optimization algorithms or reported the corresponding protein yields. RESULTS: In our efforts to understand what factors constitute an optimized CDS, we identified that there is little consensus among codon-optimization algorithms, a roughly equivalent chance that an algorithm-optimized CDS will increase or diminish recombinant yields as compared to the native DNA, a near ubiquitous use of a codon database that was last updated in 2007, and a high variability of output CDSs by some algorithms. We present a case study, using KRas4B, to demonstrate that a median codon frequency may be a better predictor of soluble yields than the more commonly utilized CAI metric. CONCLUSIONS: We present a method for visualizing, analyzing, and comparing algorithm-optimized DNA sequences for recombinant protein expression. We encourage researchers to consider if DNA optimization is right for their experiments, and work towards improving the reproducibility of published recombinant work by publishing non-native CDSs. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12915-021-00968-8. BioMed Central 2021-02-19 /pmc/articles/PMC7893858/ /pubmed/33607980 http://dx.doi.org/10.1186/s12915-021-00968-8 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Methodology Article Ranaghan, Matthew J. Li, Jeffrey J. Laprise, Dylan M. Garvie, Colin W. Assessing optimal: inequalities in codon optimization algorithms |
title | Assessing optimal: inequalities in codon optimization algorithms |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7893858/ https://www.ncbi.nlm.nih.gov/pubmed/33607980 http://dx.doi.org/10.1186/s12915-021-00968-8 |
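
The description field in this record contrasts two ways of scoring a CDS: the codon adaptation index (CAI) and a median codon frequency. The Python sketch below illustrates how the two scores can differ in principle. It is not the authors' method: the usage table `USAGE` holds made-up illustrative fractions rather than values from any real codon database, the function names are hypothetical, and the article's exact definition of "median codon frequency" is assumed here to be the median of each codon's usage fraction within its synonymous family. CAI is computed in its usual form, the geometric mean of each codon's relative adaptiveness (its frequency divided by that of the most used synonymous codon).

```python
from math import exp, log
from statistics import median

# Hypothetical usage fractions within each synonymous family.
# Illustrative numbers only; not taken from a real codon database.
USAGE = {
    "Lys": {"AAA": 0.75, "AAG": 0.25},
    "Gly": {"GGT": 0.35, "GGC": 0.40, "GGA": 0.10, "GGG": 0.15},
    "Glu": {"GAA": 0.70, "GAG": 0.30},
}

# Per-codon lookup tables derived from USAGE.
FREQ = {}    # codon -> usage fraction within its family
WEIGHT = {}  # codon -> relative adaptiveness (fraction / family maximum)
for family in USAGE.values():
    top = max(family.values())
    for codon, fraction in family.items():
        FREQ[codon] = fraction
        WEIGHT[codon] = fraction / top


def split_codons(cds):
    """Split a coding sequence into codons; reject empty or partial input."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def cai(cds):
    """Geometric mean of relative adaptiveness over all codons.

    Codons missing from the toy table raise KeyError.
    """
    weights = [WEIGHT[c] for c in split_codons(cds)]
    return exp(sum(log(w) for w in weights) / len(weights))


def median_codon_frequency(cds):
    """Median usage fraction over all codons (assumed definition)."""
    return median(FREQ[c] for c in split_codons(cds))


if __name__ == "__main__":
    for name, seq in [("common codons", "AAAGGCGAA"), ("rare codons", "AAGGGAGAG")]:
        print(name, round(cai(seq), 3), median_codon_frequency(seq))
```

Both toy sequences encode the same Lys-Gly-Glu peptide. Because CAI is a geometric mean, a single very rare codon pulls the score down multiplicatively, whereas the median is insensitive to a few outliers; a real comparison would need a full, current codon usage table for the expression host.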