Design of synthetic promoters for cyanobacteria with generative deep-learning model

Deep generative models, which can approximate complex data distributions from large datasets, are widely used in biological dataset analysis. In particular, they can identify and unravel hidden traits encoded within a complicated nucleotide sequence, allowing us to design genetic parts with accuracy....

Bibliographic Details
Main authors: Seo, Euijin; Choi, Yun-Nam; Shin, Ye Rim; Kim, Donghyuk; Lee, Jeong Wook
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Synthetic Biology and Bioengineering
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10359606/
https://www.ncbi.nlm.nih.gov/pubmed/37246641
http://dx.doi.org/10.1093/nar/gkad451
author Seo, Euijin
Choi, Yun-Nam
Shin, Ye Rim
Kim, Donghyuk
Lee, Jeong Wook
collection PubMed
description Deep generative models, which can approximate complex data distributions from large datasets, are widely used in biological dataset analysis. In particular, they can identify and unravel hidden traits encoded within a complicated nucleotide sequence, allowing us to design genetic parts with accuracy. Here, we provide a generic deep-learning-based framework to design and evaluate synthetic promoters for cyanobacteria using generative models, which was in turn validated with a cell-free transcription assay. We developed a deep generative model and a predictive model using a variational autoencoder and a convolutional neural network, respectively. Using native promoter sequences of the model unicellular cyanobacterium Synechocystis sp. PCC 6803 as a training dataset, we generated 10 000 synthetic promoter sequences and predicted their strengths. By position weight matrix and k-mer analyses, we confirmed that our model captured a valid feature of cyanobacteria promoters from the dataset. Furthermore, critical subregion identification analysis consistently revealed the importance of the -10 box sequence motif in cyanobacteria promoters. Moreover, we validated via a cell-free transcription assay that the generated promoter sequence can efficiently drive transcription. This approach, combining in silico and in vitro studies, will provide a foundation for the rapid design and validation of synthetic promoters, especially for non-model organisms.
format Online
Article
Text
id pubmed-10359606
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10359606 2023-07-22 Nucleic Acids Res Synthetic Biology and Bioengineering Oxford University Press 2023-05-29 /pmc/articles/PMC10359606/ /pubmed/37246641 http://dx.doi.org/10.1093/nar/gkad451 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Design of synthetic promoters for cyanobacteria with generative deep-learning model
topic Synthetic Biology and Bioengineering