Characterizing Promoter and Enhancer Sequences by a Deep Learning Method
Promoters and enhancers are well-known regulatory elements modulating gene expression. As confirmed by high-throughput sequencing technologies, these regulatory elements are bidirectionally transcribed. That is, promoters produce stable mRNA in the sense direction and unstable RNA in the antisense d...
Main Authors: | Zeng, Xin; Park, Sung-Joon; Nakai, Kenta |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8239401/ https://www.ncbi.nlm.nih.gov/pubmed/34211503 http://dx.doi.org/10.3389/fgene.2021.681259 |
_version_ | 1783715069359554560 |
---|---|
author | Zeng, Xin Park, Sung-Joon Nakai, Kenta |
author_facet | Zeng, Xin Park, Sung-Joon Nakai, Kenta |
author_sort | Zeng, Xin |
collection | PubMed |
description | Promoters and enhancers are well-known regulatory elements modulating gene expression. As confirmed by high-throughput sequencing technologies, these regulatory elements are bidirectionally transcribed. That is, promoters produce stable mRNA in the sense direction and unstable RNA in the antisense direction, while enhancers transcribe unstable RNA in both directions. Although it is thought that enhancers and promoters share a similar architecture of transcription start sites (TSSs), how the transcriptional machinery distinctly uses these genomic regions as promoters or enhancers remains unclear. To address this issue, we developed a deep learning (DL) method by utilizing a convolutional neural network (CNN) and the saliency algorithm. In comparison with other classifiers, our CNN presented higher predictive performance, suggesting the overarching importance of the high-order sequence features, captured by the CNN. Moreover, our method revealed that there are substantial sequence differences between the enhancers and promoters. Remarkably, the 20–120 bp downstream regions from the center of bidirectional TSSs seemed to contribute to the RNA stability. These regions in promoters tend to have a larger number of guanines and cytosines compared to those in enhancers, and this feature contributed to the classification of the regulatory elements. Our CNN-based method can capture the complex TSS architectures. We found that the genomic regions around TSSs for promoters and enhancers contribute to RNA stability and show GC-biased characteristics as a critical determinant for promoter TSSs. |
format | Online Article Text |
id | pubmed-8239401 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-82394012021-06-30 Characterizing Promoter and Enhancer Sequences by a Deep Learning Method Zeng, Xin Park, Sung-Joon Nakai, Kenta Front Genet Genetics Promoters and enhancers are well-known regulatory elements modulating gene expression. As confirmed by high-throughput sequencing technologies, these regulatory elements are bidirectionally transcribed. That is, promoters produce stable mRNA in the sense direction and unstable RNA in the antisense direction, while enhancers transcribe unstable RNA in both directions. Although it is thought that enhancers and promoters share a similar architecture of transcription start sites (TSSs), how the transcriptional machinery distinctly uses these genomic regions as promoters or enhancers remains unclear. To address this issue, we developed a deep learning (DL) method by utilizing a convolutional neural network (CNN) and the saliency algorithm. In comparison with other classifiers, our CNN presented higher predictive performance, suggesting the overarching importance of the high-order sequence features, captured by the CNN. Moreover, our method revealed that there are substantial sequence differences between the enhancers and promoters. Remarkably, the 20–120 bp downstream regions from the center of bidirectional TSSs seemed to contribute to the RNA stability. These regions in promoters tend to have a larger number of guanines and cytosines compared to those in enhancers, and this feature contributed to the classification of the regulatory elements. Our CNN-based method can capture the complex TSS architectures. We found that the genomic regions around TSSs for promoters and enhancers contribute to RNA stability and show GC-biased characteristics as a critical determinant for promoter TSSs. Frontiers Media S.A. 2021-06-15 /pmc/articles/PMC8239401/ /pubmed/34211503 http://dx.doi.org/10.3389/fgene.2021.681259 Text en Copyright © 2021 Zeng, Park and Nakai. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Zeng, Xin Park, Sung-Joon Nakai, Kenta Characterizing Promoter and Enhancer Sequences by a Deep Learning Method |
title | Characterizing Promoter and Enhancer Sequences by a Deep Learning Method |
title_sort | characterizing promoter and enhancer sequences by a deep learning method |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8239401/ https://www.ncbi.nlm.nih.gov/pubmed/34211503 http://dx.doi.org/10.3389/fgene.2021.681259 |
work_keys_str_mv | AT zengxin characterizingpromoterandenhancersequencesbyadeeplearningmethod AT parksungjoon characterizingpromoterandenhancersequencesbyadeeplearningmethod AT nakaikenta characterizingpromoterandenhancersequencesbyadeeplearningmethod |
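
The description field above reports that the regions 20–120 bp downstream from the center of bidirectional TSSs tend to contain more guanines and cytosines in promoters than in enhancers. As a rough illustration of how such a GC comparison could be computed, here is a minimal Python sketch. It is not the authors' pipeline: the function names, the 0-based `center` convention, and the toy sequences are all assumptions made for this example.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (case-insensitive); 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def downstream_window(seq: str, center: int, start: int = 20, end: int = 120) -> str:
    """Return the bases from center+start up to (not including) center+end.

    `center` is a 0-based index standing in for the midpoint of the
    bidirectional TSSs (a hypothetical convention for this sketch).
    The window is clipped to the sequence boundaries.
    """
    lo = max(0, center + start)
    hi = min(len(seq), center + end)
    return seq[lo:hi]


# Toy sequences, made up for illustration only (not data from the paper).
promoter_like = "AT" * 50 + "GC" * 100  # GC-rich downstream of the center
enhancer_like = "AT" * 50 + "AT" * 100  # AT-rich downstream of the center
center = 100

promoter_gc = gc_fraction(downstream_window(promoter_like, center))
enhancer_gc = gc_fraction(downstream_window(enhancer_like, center))
print(promoter_gc, enhancer_gc)  # prints: 1.0 0.0
```

On real data, the same two functions would be applied to each annotated region and the resulting GC fractions compared between the promoter and enhancer sets; the 20 and 120 defaults mirror the window named in the description.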