Assessing optimal: inequalities in codon optimization algorithms

BACKGROUND: Custom genes have become a common resource in recombinant biology over the last 20 years due to the plummeting cost of DNA synthesis. These genes are often “optimized” to non-native sequences for overexpression in a non-native host by substituting synonymous codons within the coding DNA...

Bibliographic Details
Main authors: Ranaghan, Matthew J., Li, Jeffrey J., Laprise, Dylan M., Garvie, Colin W.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7893858/
https://www.ncbi.nlm.nih.gov/pubmed/33607980
http://dx.doi.org/10.1186/s12915-021-00968-8
_version_ 1783653131861622784
author Ranaghan, Matthew J.
Li, Jeffrey J.
Laprise, Dylan M.
Garvie, Colin W.
collection PubMed
description BACKGROUND: Custom genes have become a common resource in recombinant biology over the last 20 years due to the plummeting cost of DNA synthesis. These genes are often “optimized” to non-native sequences for overexpression in a non-native host by substituting synonymous codons within the coding DNA sequence (CDS). A handful of studies have compared native and optimized CDSs, reporting different levels of soluble product due to the accumulation of misfolded aggregates, variable activity of enzymes, and (at least one report of) a change in substrate specificity. No study, to the best of our knowledge, has performed a practical comparison of CDSs generated from different codon optimization algorithms or reported the corresponding protein yields. RESULTS: In our efforts to understand what factors constitute an optimized CDS, we identified that there is little consensus among codon-optimization algorithms, a roughly equivalent chance that an algorithm-optimized CDS will increase or diminish recombinant yields as compared to the native DNA, a near ubiquitous use of a codon database that was last updated in 2007, and a high variability of output CDSs by some algorithms. We present a case study, using KRas4B, to demonstrate that a median codon frequency may be a better predictor of soluble yields than the more commonly utilized CAI metric. CONCLUSIONS: We present a method for visualizing, analyzing, and comparing algorithm-optimized DNA sequences for recombinant protein expression. We encourage researchers to consider if DNA optimization is right for their experiments, and work towards improving the reproducibility of published recombinant work by publishing non-native CDSs. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12915-021-00968-8.
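The description contrasts two sequence-level metrics, the codon adaptation index (CAI) and a median codon frequency. The record itself gives no formulas, so the following is only a minimal sketch of how the two could be computed side by side. It assumes the textbook CAI definition (geometric mean of each codon's frequency relative to the most frequent synonymous codon) and takes "median codon frequency" to mean the median of per-codon usage values; the article's exact procedure may differ. The table `TOY_USAGE`, the mapping `SYNONYMS`, and all function names are hypothetical, and the numbers are illustrative rather than taken from any real codon database.

```python
import math
from statistics import median

# Hypothetical toy codon-usage table (occurrences per thousand codons).
# Illustrative values only; NOT taken from any real database.
TOY_USAGE = {
    "GCT": 15.0, "GCC": 26.0, "GCA": 20.0, "GCG": 34.0,  # Ala
    "AAA": 33.0, "AAG": 10.0,                            # Lys
    "GAA": 40.0, "GAG": 18.0,                            # Glu
}

# Synonymous codon families for the three amino acids in the toy table.
SYNONYMS = {
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}


def split_codons(cds):
    """Split a coding sequence into consecutive triplets."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def cai(cds, usage=TOY_USAGE):
    """Geometric mean of relative adaptiveness (assumed textbook CAI)."""
    log_weights = []
    for codon in split_codons(cds):
        family = SYNONYMS[CODON_TO_AA[codon]]
        # Weight = this codon's frequency / frequency of the best synonym.
        weight = usage[codon] / max(usage[c] for c in family)
        log_weights.append(math.log(weight))
    return math.exp(sum(log_weights) / len(log_weights))


def median_codon_frequency(cds, usage=TOY_USAGE):
    """Median of the per-codon usage values along the CDS (assumed definition)."""
    return median(usage[codon] for codon in split_codons(cds))


if __name__ == "__main__":
    # Two synonymous encodings of the peptide Ala-Lys-Glu.
    for name, seq in [("frequent codons", "GCGAAAGAA"), ("rare codons", "GCTAAGGAG")]:
        print(name, round(cai(seq), 3), median_codon_frequency(seq))
```

Both metrics are computed from the same usage table, so any difference between them comes from how the per-codon values are aggregated: CAI normalizes within each synonymous family, while the median works on the raw frequencies.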
format Online
Article
Text
id pubmed-7893858
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7893858 2021-02-22 Assessing optimal: inequalities in codon optimization algorithms Ranaghan, Matthew J. Li, Jeffrey J. Laprise, Dylan M. Garvie, Colin W. BMC Biol Methodology Article BioMed Central 2021-02-19 /pmc/articles/PMC7893858/ /pubmed/33607980 http://dx.doi.org/10.1186/s12915-021-00968-8 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Assessing optimal: inequalities in codon optimization algorithms
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7893858/
https://www.ncbi.nlm.nih.gov/pubmed/33607980
http://dx.doi.org/10.1186/s12915-021-00968-8