
Codon optimization with deep learning to enhance protein expression

Heterologous expression is the main approach for recombinant protein production in genetic synthesis, for which codon optimization is necessary. The existing optimization methods are based on biological indexes. In this paper, we propose a novel codon optimization method based on deep learning. First...


Bibliographic Details
Main authors: Fu, Hongguang, Liang, Yanbing, Zhong, Xiuqin, Pan, ZhiLing, Huang, Lei, Zhang, HaiLin, Xu, Yang, Zhou, Wei, Liu, Zhong
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7572362/
https://www.ncbi.nlm.nih.gov/pubmed/33077783
http://dx.doi.org/10.1038/s41598-020-74091-z
author Fu, Hongguang
Liang, Yanbing
Zhong, Xiuqin
Pan, ZhiLing
Huang, Lei
Zhang, HaiLin
Xu, Yang
Zhou, Wei
Liu, Zhong
collection PubMed
description Heterologous expression is the main approach for recombinant protein production in genetic synthesis, for which codon optimization is necessary. The existing optimization methods are based on biological indexes. In this paper, we propose a novel codon optimization method based on deep learning. First, we introduce the concept of codon boxes, via which DNA sequences can be recoded into codon box sequences while ignoring the order of bases. Then, the problem of codon optimization can be converted to sequence annotation of corresponding amino acids with codon boxes. The codon optimization models for Escherichia coli were trained with a Bidirectional Long Short-Term Memory Conditional Random Field (BiLSTM-CRF). Theoretically, deep learning is a good method to obtain the distribution characteristics of DNA. In addition to the comparison of the codon adaptation index, protein expression experiments for a Plasmodium falciparum candidate vaccine and polymerase acidic protein were carried out for comparison with the original sequences and the optimized sequences from Genewiz and ThermoFisher. The results show that our method for enhancing protein expression is efficient and competitive.
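The recoding step named in the description can be illustrated with a minimal Python sketch. It assumes that a codon box is simply the unordered collection of a codon's three bases, represented here by the bases sorted alphabetically; the article's exact definition and labels may differ, and the function names `codon_box` and `to_codon_boxes` are hypothetical, not taken from the article.

```python
from itertools import product


def codon_box(codon: str) -> str:
    """Return an order-independent label for a codon.

    Assumption: a codon box is the multiset of the codon's three bases,
    written here as the bases sorted alphabetically.
    """
    return "".join(sorted(codon.upper()))


def to_codon_boxes(dna: str) -> list[str]:
    """Split a coding sequence into codons and recode each as its box label."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [codon_box(dna[i:i + 3]) for i in range(0, len(dna), 3)]


# Codons that differ only in base order share one box.
same_box = codon_box("GCA") == codon_box("CAG")

# Under this assumption the 64 codons collapse into 20 boxes.
all_boxes = {codon_box("".join(c)) for c in product("ACGT", repeat=3)}

boxes = to_codon_boxes("ATGGCTGCAAAATAA")
```

Under this reading, predicting a codon box for each amino acid is a sequence labeling task: because the amino acid is already known, a predicted box narrows the choice to the synonymous codons whose bases match that box.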
format Online
Article
Text
id pubmed-7572362
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7572362 2020-10-21 Sci Rep Article Nature Publishing Group UK 2020-10-19 /pmc/articles/PMC7572362/ /pubmed/33077783 http://dx.doi.org/10.1038/s41598-020-74091-z Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Codon optimization with deep learning to enhance protein expression
topic Article