Codon optimization with deep learning to enhance protein expression
Heterologous expression is the main approach for recombinant protein production in genetic synthesis, for which codon optimization is necessary. The existing optimization methods are based on biological indexes. In this paper, we propose a novel codon optimization method based on deep learning. First...
Main authors: | Fu, Hongguang; Liang, Yanbing; Zhong, Xiuqin; Pan, ZhiLing; Huang, Lei; Zhang, HaiLin; Xu, Yang; Zhou, Wei; Liu, Zhong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7572362/ https://www.ncbi.nlm.nih.gov/pubmed/33077783 http://dx.doi.org/10.1038/s41598-020-74091-z |
_version_ | 1783597325516537856 |
---|---|
author | Fu, Hongguang Liang, Yanbing Zhong, Xiuqin Pan, ZhiLing Huang, Lei Zhang, HaiLin Xu, Yang Zhou, Wei Liu, Zhong |
author_facet | Fu, Hongguang Liang, Yanbing Zhong, Xiuqin Pan, ZhiLing Huang, Lei Zhang, HaiLin Xu, Yang Zhou, Wei Liu, Zhong |
author_sort | Fu, Hongguang |
collection | PubMed |
description | Heterologous expression is the main approach for recombinant protein production in genetic synthesis, for which codon optimization is necessary. The existing optimization methods are based on biological indexes. In this paper, we propose a novel codon optimization method based on deep learning. First, we introduce the concept of codon boxes, via which DNA sequences can be recoded into codon box sequences while ignoring the order of bases. Then, the problem of codon optimization can be converted to sequence annotation of corresponding amino acids with codon boxes. The codon optimization models for Escherichia coli were trained by the Bidirectional Long-Short-Term Memory Conditional Random Field. Theoretically, deep learning is a good method to obtain the distribution characteristics of DNA. In addition to the comparison of the codon adaptation index, protein expression experiments for Plasmodium falciparum candidate vaccine and polymerase acidic protein were implemented for comparison with the original sequences and the optimized sequences from Genewiz and ThermoFisher. The results show that our method for enhancing protein expression is efficient and competitive. |
format | Online Article Text |
id | pubmed-7572362 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-75723622020-10-21 Codon optimization with deep learning to enhance protein expression Fu, Hongguang Liang, Yanbing Zhong, Xiuqin Pan, ZhiLing Huang, Lei Zhang, HaiLin Xu, Yang Zhou, Wei Liu, Zhong Sci Rep Article Heterologous expression is the main approach for recombinant protein production in genetic synthesis, for which codon optimization is necessary. The existing optimization methods are based on biological indexes. In this paper, we propose a novel codon optimization method based on deep learning. First, we introduce the concept of codon boxes, via which DNA sequences can be recoded into codon box sequences while ignoring the order of bases. Then, the problem of codon optimization can be converted to sequence annotation of corresponding amino acids with codon boxes. The codon optimization models for Escherichia coli were trained by the Bidirectional Long-Short-Term Memory Conditional Random Field. Theoretically, deep learning is a good method to obtain the distribution characteristics of DNA. In addition to the comparison of the codon adaptation index, protein expression experiments for Plasmodium falciparum candidate vaccine and polymerase acidic protein were implemented for comparison with the original sequences and the optimized sequences from Genewiz and ThermoFisher. The results show that our method for enhancing protein expression is efficient and competitive. Nature Publishing Group UK 2020-10-19 /pmc/articles/PMC7572362/ /pubmed/33077783 http://dx.doi.org/10.1038/s41598-020-74091-z Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Fu, Hongguang Liang, Yanbing Zhong, Xiuqin Pan, ZhiLing Huang, Lei Zhang, HaiLin Xu, Yang Zhou, Wei Liu, Zhong Codon optimization with deep learning to enhance protein expression |
title | Codon optimization with deep learning to enhance protein expression |
title_full | Codon optimization with deep learning to enhance protein expression |
title_fullStr | Codon optimization with deep learning to enhance protein expression |
title_full_unstemmed | Codon optimization with deep learning to enhance protein expression |
title_short | Codon optimization with deep learning to enhance protein expression |
title_sort | codon optimization with deep learning to enhance protein expression |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7572362/ https://www.ncbi.nlm.nih.gov/pubmed/33077783 http://dx.doi.org/10.1038/s41598-020-74091-z |
work_keys_str_mv | AT fuhongguang codonoptimizationwithdeeplearningtoenhanceproteinexpression AT liangyanbing codonoptimizationwithdeeplearningtoenhanceproteinexpression AT zhongxiuqin codonoptimizationwithdeeplearningtoenhanceproteinexpression AT panzhiling codonoptimizationwithdeeplearningtoenhanceproteinexpression AT huanglei codonoptimizationwithdeeplearningtoenhanceproteinexpression AT zhanghailin codonoptimizationwithdeeplearningtoenhanceproteinexpression AT xuyang codonoptimizationwithdeeplearningtoenhanceproteinexpression AT zhouwei codonoptimizationwithdeeplearningtoenhanceproteinexpression AT liuzhong codonoptimizationwithdeeplearningtoenhanceproteinexpression |
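
The description above says that DNA sequences are recoded into codon box sequences "while ignoring the order of bases", and that optimization then becomes labeling each amino acid with a codon box. The record does not give the exact encoding, so the following is only a minimal sketch under one assumed reading: a codon box is taken to be the sorted bases of a codon. The function names `codon_box`, `to_labeled_pairs`, and `decode` are hypothetical and are not taken from the article.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over "TCAG".
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_box(codon):
    """ASSUMPTION: a codon box is the codon's bases with their order discarded,
    represented here as the alphabetically sorted three-letter string."""
    return "".join(sorted(codon.upper()))


def to_labeled_pairs(dna):
    """Recode a coding sequence as (amino acid, codon box) pairs, i.e. the
    token/label form a sequence-labeling model could be trained on."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [dna[i:i + 3].upper() for i in range(0, len(dna), 3)]
    return [(CODON_TO_AA[c], codon_box(c)) for c in codons]


def decode(pairs):
    """Map (amino acid, codon box) pairs back to DNA when the pair identifies
    exactly one codon; raise otherwise."""
    out = []
    for aa, box in pairs:
        matches = [
            c for c, a in CODON_TO_AA.items()
            if a == aa and codon_box(c) == box
        ]
        if len(matches) != 1:
            raise ValueError(f"({aa}, {box}) does not identify a unique codon")
        out.append(matches[0])
    return "".join(out)


pairs = to_labeled_pairs("ATGGCTTAA")
print(pairs)
print(decode(pairs))
```

Under this assumed encoding, the stop codons TAG and TGA share the same sorted box, so `decode` raises for that pair; how the article treats such cases is not stated in this record.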