A pan-genome method to determine core regions of the Bacillus subtilis and Escherichia coli genomes

Background: Synthetic engineering of bacteria to produce industrial products is a burgeoning field of research and application. In order to optimize genome design, designers need to understand which genes are essential, which are optimal for growth, and locations in the genome that will be tolerated...

Bibliographic Details
Main authors: Sutton, Granger; Fogel, Gary B.; Abramson, Bradley; Brinkac, Lauren; Michael, Todd; Liu, Enoch S.; Thomas, Sterling
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8156514/
https://www.ncbi.nlm.nih.gov/pubmed/34113437
http://dx.doi.org/10.12688/f1000research.51873.2
author Sutton, Granger
Fogel, Gary B.
Abramson, Bradley
Brinkac, Lauren
Michael, Todd
Liu, Enoch S.
Thomas, Sterling
author_sort Sutton, Granger
collection PubMed
description Background: Synthetic engineering of bacteria to produce industrial products is a burgeoning field of research and application. In order to optimize genome design, designers need to understand which genes are essential, which are optimal for growth, and locations in the genome that will be tolerated by the organism when inserting engineered cassettes. Methods: We present a pan-genome based method for the identification of core regions in a genome that are strongly conserved at the species level. Results: We show that the core regions determined by our method contain all or almost all essential genes. This demonstrates the accuracy of our method as essential genes should be core genes. We show that we outperform previous methods by this measure. We also explain why there are exceptions to this rule for our method. Conclusions: We assert that synthetic engineers should avoid deleting or inserting into these core regions unless they understand and are manipulating the function of the genes in that region. Similarly, if the designer wishes to streamline the genome, non-core regions and in particular low penetrance genes would be good targets for deletion. Care should be taken to remove entire cassettes with similar penetrance of the genes within cassettes as they may harbor toxin/antitoxin genes which need to be removed in tandem. The bioinformatic approach introduced here saves considerable time and effort relative to knockout studies on single isolates of a given species and captures a broad understanding of the conservation of genes that are core to a species.
format Online
Article
Text
id pubmed-8156514
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-8156514 2021-06-09 A pan-genome method to determine core regions of the Bacillus subtilis and Escherichia coli genomes Sutton, Granger Fogel, Gary B. Abramson, Bradley Brinkac, Lauren Michael, Todd Liu, Enoch S. Thomas, Sterling F1000Res Research Article
F1000 Research Limited 2021-09-02 /pmc/articles/PMC8156514/ /pubmed/34113437 http://dx.doi.org/10.12688/f1000research.51873.2 Text en Copyright: © 2021 Sutton G et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A pan-genome method to determine core regions of the Bacillus subtilis and Escherichia coli genomes
topic Research Article