iEnhancer-EBLSTM: Identifying Enhancers and Strengths by Ensembles of Bidirectional Long Short-Term Memory

Enhancers are regulatory DNA sequences that could be bound by specific proteins named transcription factors (TFs). The interactions between enhancers and TFs regulate specific genes by increasing the target gene expression. Therefore, enhancer identification and classification have been a critical i...

Bibliographic Details
Main Authors: Niu, Kun; Luo, Ximei; Zhang, Shumei; Teng, Zhixia; Zhang, Tianjiao; Zhao, Yuming
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8021722/
https://www.ncbi.nlm.nih.gov/pubmed/33833783
http://dx.doi.org/10.3389/fgene.2021.665498
_version_ 1783674792686125056
author Niu, Kun
Luo, Ximei
Zhang, Shumei
Teng, Zhixia
Zhang, Tianjiao
Zhao, Yuming
author_facet Niu, Kun
Luo, Ximei
Zhang, Shumei
Teng, Zhixia
Zhang, Tianjiao
Zhao, Yuming
author_sort Niu, Kun
collection PubMed
description Enhancers are regulatory DNA sequences that can be bound by specific proteins called transcription factors (TFs). Interactions between enhancers and TFs regulate specific genes by increasing expression of the target gene. Enhancer identification and classification are therefore a critical issue in the enhancer field. Unfortunately, suitable methods for identifying enhancers are still lacking. Previous research has mainly focused on features of enhancer function and interactions and has ignored sequence information. Recurrent neural network (RNN) and long short-term memory (LSTM) models are currently the most common methods for processing time series data, and LSTM is better suited than RNN to handling DNA sequences. In this paper, we take advantage of LSTM to build a method named iEnhancer-EBLSTM to identify enhancers. iEnhancer-EBLSTM, where EBLSTM stands for ensembles of bidirectional LSTM, consists of two steps. In the first step, we extract subsequences as features by sliding a 3-mer window along the DNA sequence. In the second step, the EBLSTM model is used to identify enhancers from the candidate input sequences. We use the dataset from the study of Quang H et al. as the benchmark. The experimental results on this dataset demonstrate the efficiency of our proposed model.
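The first step named in the description, sliding a 3-mer window along a DNA sequence, can be illustrated with a short sketch. This is not the authors' implementation: the stride of 1, the integer vocabulary, the use of index 0 for unknown 3-mers, and the function names `extract_kmers`, `build_vocab`, and `encode` are all assumptions made for illustration.

```python
from itertools import product


def extract_kmers(sequence, k=3, stride=1):
    """Return the overlapping k-mers obtained by sliding a window of width k.

    A stride of 1 is an assumption; the record does not state the step size.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(k=3, alphabet="ACGT"):
    """Map every possible k-mer over the alphabet to an integer ID starting at 1.

    ID 0 is reserved (hypothetically) for k-mers with unknown bases such as N.
    """
    return {"".join(p): idx + 1 for idx, p in enumerate(product(alphabet, repeat=k))}


def encode(sequence, vocab, k=3):
    """Turn a DNA sequence into a list of k-mer IDs, e.g. as input to an embedding layer."""
    return [vocab.get(kmer, 0) for kmer in extract_kmers(sequence, k)]


if __name__ == "__main__":
    vocab = build_vocab()
    print(extract_kmers("ACGTAC"))
    print(encode("ACGTAC", vocab))
```

A sequence of length n yields n - 2 overlapping 3-mers under these assumptions, and the resulting ID lists are the kind of token sequence a bidirectional LSTM could consume after an embedding step.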
format Online
Article
Text
id pubmed-8021722
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-80217222021-04-07 iEnhancer-EBLSTM: Identifying Enhancers and Strengths by Ensembles of Bidirectional Long Short-Term Memory Niu, Kun Luo, Ximei Zhang, Shumei Teng, Zhixia Zhang, Tianjiao Zhao, Yuming Front Genet Genetics Enhancers are regulatory DNA sequences that could be bound by specific proteins named transcription factors (TFs). The interactions between enhancers and TFs regulate specific genes by increasing the target gene expression. Therefore, enhancer identification and classification have been a critical issue in the enhancer field. Unfortunately, so far there has been a lack of suitable methods to identify enhancers. Previous research has mainly focused on the features of the enhancer’s function and interactions, which ignores the sequence information. As we know, the recurrent neural network (RNN) and long short-term memory (LSTM) models are currently the most common methods for processing time series data. LSTM is more suitable than RNN to address the DNA sequence. In this paper, we take the advantages of LSTM to build a method named iEnhancer-EBLSTM to identify enhancers. iEnhancer-ensembles of bidirectional LSTM (EBLSTM) consists of two steps. In the first step, we extract subsequences by sliding a 3-mer window along the DNA sequence as features. Second, EBLSTM model is used to identify enhancers from the candidate input sequences. We use the dataset from the study of Quang H et al. as the benchmarks. The experimental results from the datasets demonstrate the efficiency of our proposed model. Frontiers Media S.A. 2021-03-23 /pmc/articles/PMC8021722/ /pubmed/33833783 http://dx.doi.org/10.3389/fgene.2021.665498 Text en Copyright © 2021 Niu, Luo, Zhang, Teng, Zhang and Zhao. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). 
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Niu, Kun
Luo, Ximei
Zhang, Shumei
Teng, Zhixia
Zhang, Tianjiao
Zhao, Yuming
iEnhancer-EBLSTM: Identifying Enhancers and Strengths by Ensembles of Bidirectional Long Short-Term Memory
title iEnhancer-EBLSTM: Identifying Enhancers and Strengths by Ensembles of Bidirectional Long Short-Term Memory
title_full iEnhancer-EBLSTM: Identifying Enhancers and Strengths by Ensembles of Bidirectional Long Short-Term Memory
title_fullStr iEnhancer-EBLSTM: Identifying Enhancers and Strengths by Ensembles of Bidirectional Long Short-Term Memory
title_full_unstemmed iEnhancer-EBLSTM: Identifying Enhancers and Strengths by Ensembles of Bidirectional Long Short-Term Memory
title_short iEnhancer-EBLSTM: Identifying Enhancers and Strengths by Ensembles of Bidirectional Long Short-Term Memory
title_sort ienhancer-eblstm: identifying enhancers and strengths by ensembles of bidirectional long short-term memory
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8021722/
https://www.ncbi.nlm.nih.gov/pubmed/33833783
http://dx.doi.org/10.3389/fgene.2021.665498