Design Parameters to Control Synthetic Gene Expression in Escherichia coli

BACKGROUND: Production of proteins as therapeutic agents, research reagents and molecular tools frequently depends on expression in heterologous hosts. Synthetic genes are increasingly used for protein production because sequence information is easier to obtain than the corresponding physical DNA. P...

Bibliographic Details
Main authors: Welch, Mark; Govindarajan, Sridhar; Ness, Jon E.; Villalobos, Alan; Gurney, Austin; Minshull, Jeremy; Gustafsson, Claes
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2736378/
https://www.ncbi.nlm.nih.gov/pubmed/19759823
http://dx.doi.org/10.1371/journal.pone.0007002
author Welch, Mark
Govindarajan, Sridhar
Ness, Jon E.
Villalobos, Alan
Gurney, Austin
Minshull, Jeremy
Gustafsson, Claes
collection PubMed
description BACKGROUND: Production of proteins as therapeutic agents, research reagents and molecular tools frequently depends on expression in heterologous hosts. Synthetic genes are increasingly used for protein production because sequence information is easier to obtain than the corresponding physical DNA. Protein-coding sequences are commonly re-designed to enhance expression, but there are no experimentally supported design principles. PRINCIPAL FINDINGS: To identify sequence features that affect protein expression we synthesized and expressed in E. coli two sets of 40 genes encoding two commercially valuable proteins, a DNA polymerase and a single chain antibody. Genes differing only in synonymous codon usage expressed protein at levels ranging from undetectable to 30% of cellular protein. Using partial least squares regression we tested the correlation of protein production levels with parameters that have been reported to affect expression. We found that the amount of protein produced in E. coli was strongly dependent on the codons used to encode a subset of amino acids. Favorable codons were predominantly those read by tRNAs that are most highly charged during amino acid starvation, not codons that are most abundant in highly expressed E. coli proteins. Finally we confirmed the validity of our models by designing, synthesizing and testing new genes using codon biases predicted to perform well. CONCLUSION: The systematic analysis of gene design parameters shown in this study has allowed us to identify codon usage within a gene as a critical determinant of achievable protein expression levels in E. coli. We propose a biochemical basis for this, as well as design algorithms to ensure high protein production from synthetic genes. Replication of this methodology should allow similar design algorithms to be empirically derived for any expression system.
format Text
id pubmed-2736378
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2736378 2009-09-17 Design Parameters to Control Synthetic Gene Expression in Escherichia coli Welch, Mark; Govindarajan, Sridhar; Ness, Jon E.; Villalobos, Alan; Gurney, Austin; Minshull, Jeremy; Gustafsson, Claes PLoS One Research Article Public Library of Science 2009-09-14 /pmc/articles/PMC2736378/ /pubmed/19759823 http://dx.doi.org/10.1371/journal.pone.0007002 Text en Welch et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Design Parameters to Control Synthetic Gene Expression in Escherichia coli
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2736378/
https://www.ncbi.nlm.nih.gov/pubmed/19759823
http://dx.doi.org/10.1371/journal.pone.0007002