
An Efficient Lightweight Hybrid Model with Attention Mechanism for Enhancer Sequence Recognition

Enhancers are sequences with short motifs that exhibit high positional variability and free scattering properties. Identification of these noncoding DNA fragments and their strength are extremely important because they play a key role in controlling gene regulation on a cellular basis. The identific...

Bibliographic Details
Main Authors: Aladhadh, Suliman; Almatroodi, Saleh A.; Habib, Shabana; Alabdulatif, Abdulatif; Khattak, Saeed Ullah; Islam, Muhammad
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9855522/
https://www.ncbi.nlm.nih.gov/pubmed/36671456
http://dx.doi.org/10.3390/biom13010070
author Aladhadh, Suliman
Almatroodi, Saleh A.
Habib, Shabana
Alabdulatif, Abdulatif
Khattak, Saeed Ullah
Islam, Muhammad
collection PubMed
description Enhancers are sequences with short motifs that exhibit high positional variability and free scattering properties. Identification of these noncoding DNA fragments and their strength are extremely important because they play a key role in controlling gene regulation on a cellular basis. The identification of enhancers is more complex than that of other factors in the genome because they are freely scattered, and their location varies widely. In recent years, bioinformatics tools have enabled significant improvement in identifying this biological difficulty. Cell line-specific screening is not possible using these existing computational methods based solely on DNA sequences. DNA segment chromatin accessibility may provide useful information about its potential function in regulation, thereby identifying regulatory elements based on its chromatin accessibility. In chromatin, the entanglement structure allows positions far apart in the sequence to encounter each other, regardless of their proximity to the gene to be acted upon. Thus, identifying enhancers and assessing their strength is difficult and time-consuming. The goal of our work was to overcome these limitations by presenting a convolutional neural network (CNN) with attention-gated recurrent units (AttGRU) based on Deep Learning. It used a CNN and one-hot coding to build models, primarily to identify enhancers and secondarily to classify their strength. To test the performance of the proposed model, parallels were drawn between enhancer-CNNAttGRU and existing state-of-the-art methods to enable comparisons. The proposed model performed the best for predicting stage one and stage two enhancer sequences, as well as their strengths, in a cross-species analysis, achieving best accuracy values of 87.39% and 84.46%, respectively. Overall, the results showed that the proposed model provided comparable results to state-of-the-art models, highlighting its usefulness.
format Online
Article
Text
id pubmed-9855522
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9855522 2023-01-21 An Efficient Lightweight Hybrid Model with Attention Mechanism for Enhancer Sequence Recognition Aladhadh, Suliman; Almatroodi, Saleh A.; Habib, Shabana; Alabdulatif, Abdulatif; Khattak, Saeed Ullah; Islam, Muhammad Biomolecules Article MDPI 2022-12-29 /pmc/articles/PMC9855522/ /pubmed/36671456 http://dx.doi.org/10.3390/biom13010070 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title An Efficient Lightweight Hybrid Model with Attention Mechanism for Enhancer Sequence Recognition
topic Article