
Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions

BACKGROUND: The Escherichia coli ER2566 strain (NC_CP014268.2) was developed as a BL21 (DE3) derivative strain and had been widely used in recombinant protein expression. However, like many other current RefSeq annotations, the annotation of the ER2566 strain was incomplete, with missing gene names...


Bibliographic Details
Main authors: Zhou, Lizhi, Yu, Hai, Wang, Kaihang, Chen, Tingting, Ma, Yue, Huang, Yang, Li, Jiajia, Liu, Liqin, Li, Yuqian, Kong, Zhibo, Zheng, Qingbing, Wang, Yingbin, Gu, Ying, Xia, Ningshao, Li, Shaowei
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7296898/
https://www.ncbi.nlm.nih.gov/pubmed/32546194
http://dx.doi.org/10.1186/s12864-020-06818-1
_version_ 1783546919760429056
author Zhou, Lizhi
Yu, Hai
Wang, Kaihang
Chen, Tingting
Ma, Yue
Huang, Yang
Li, Jiajia
Liu, Liqin
Li, Yuqian
Kong, Zhibo
Zheng, Qingbing
Wang, Yingbin
Gu, Ying
Xia, Ningshao
Li, Shaowei
author_sort Zhou, Lizhi
collection PubMed
description BACKGROUND: The Escherichia coli ER2566 strain (NC_CP014268.2) was developed as a BL21 (DE3) derivative strain and had been widely used in recombinant protein expression. However, like many other current RefSeq annotations, the annotation of the ER2566 strain was incomplete, with missing gene names and miscellaneous RNAs, as well as uncorrected annotations of some pseudogenes. Here, we performed a systematic reannotation of the ER2566 genome by combining multiple annotation tools with manual revision to provide a comprehensive understanding of the E. coli ER2566 strain, and used high-throughput sequencing to explore how the strain adapted under external pressure.
RESULTS: The reannotation included noteworthy corrections to all protein-coding genes, led to the exclusion of 190 hypothetical genes or pseudogenes, and resulted in the addition of 237 coding sequences, 230 miscellaneous noncoding RNAs, and 2 tRNAs. In addition, we manually examined all 194 pseudogenes in the RefSeq annotation and directly identified 123 (63%) as coding genes. We then used whole-genome sequencing and high-throughput RNA sequencing to assess mutational adaptations under consecutive subculture or overexpression burden. Whereas no mutations were detected in response to consecutive subculture, overexpression of the human papillomavirus type 16 capsid led to the identification of a mutation (position 1,094,824 within the 3′ non-coding region) positioned 19 bp away from the lacI gene in the transcribed RNA, which was not detected at the genomic level by Sanger sequencing.
CONCLUSION: The ER2566 strain was used by both the general scientific community and the biotechnology industry. Reannotation of the E. coli ER2566 strain not only improved the RefSeq data but also uncovered a key site that might be involved in the transcription and translation of genes encoding the lactose operon repressor. We proposed that our pipeline might offer a universal method for the reannotation of other bacterial genomes with high speed and accuracy. This study might facilitate a better understanding of gene function for the ER2566 strain under external burden and provided more clues to engineer bacteria for biotechnological applications.
format Online
Article
Text
id pubmed-7296898
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-72968982020-06-16 Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions Zhou, Lizhi Yu, Hai Wang, Kaihang Chen, Tingting Ma, Yue Huang, Yang Li, Jiajia Liu, Liqin Li, Yuqian Kong, Zhibo Zheng, Qingbing Wang, Yingbin Gu, Ying Xia, Ningshao Li, Shaowei BMC Genomics Research Article BACKGROUND: The Escherichia coli ER2566 strain (NC_CP014268.2) was developed as a BL21 (DE3) derivative strain and had been widely used in recombinant protein expression. However, like many other current RefSeq annotations, the annotation of the ER2566 strain was incomplete, with missing gene names and miscellaneous RNAs, as well as uncorrected annotations of some pseudogenes. Here, we performed a systematic reannotation of the ER2566 genome by combining multiple annotation tools with manual revision to provide a comprehensive understanding of the E. coli ER2566 strain, and used high-throughput sequencing to explore how the strain adapted under external pressure. RESULTS: The reannotation included noteworthy corrections to all protein-coding genes, led to the exclusion of 190 hypothetical genes or pseudogenes, and resulted in the addition of 237 coding sequences and 230 miscellaneous noncoding RNAs and 2 tRNAs. In addition, we further manually examined all 194 pseudogenes in the Ref-seq annotation and directly identified 123 (63%) as coding genes. We then used whole-genome sequencing and high-throughput RNA sequencing to assess mutational adaptations under consecutive subculture or overexpression burden. Whereas no mutations were detected in response to consecutive subculture, overexpression of the human papillomavirus 16 type capsid led to the identification of a mutation (position 1,094,824 within the 3′ non-coding region) positioned 19-bp away from the lacI gene in the transcribed RNA, which was not detected at the genomic level by Sanger sequencing. 
CONCLUSION: The ER2566 strain was used by both the general scientific community and the biotechnology industry. Reannotation of the E. coli ER2566 strain not only improved the RefSeq data but uncovered a key site that might be involved in the transcription and translation of genes encoding the lactose operon repressor. We proposed that our pipeline might offer a universal method for the reannotation of other bacterial genomes with high speed and accuracy. This study might facilitate a better understanding of gene function for the ER2566 strain under external burden and provided more clues to engineer bacteria for biotechnological applications. BioMed Central 2020-06-16 /pmc/articles/PMC7296898/ /pubmed/32546194 http://dx.doi.org/10.1186/s12864-020-06818-1 Text en © The Author(s) 2020 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research Article
Zhou, Lizhi
Yu, Hai
Wang, Kaihang
Chen, Tingting
Ma, Yue
Huang, Yang
Li, Jiajia
Liu, Liqin
Li, Yuqian
Kong, Zhibo
Zheng, Qingbing
Wang, Yingbin
Gu, Ying
Xia, Ningshao
Li, Shaowei
Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions
title Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions
title_full Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions
title_fullStr Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions
title_full_unstemmed Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions
title_short Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions
title_sort genome re-sequencing and reannotation of the escherichia coli er2566 strain and transcriptome sequencing under overexpression conditions
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7296898/
https://www.ncbi.nlm.nih.gov/pubmed/32546194
http://dx.doi.org/10.1186/s12864-020-06818-1