Identifying and Classifying Enhancers by Dinucleotide-Based Auto-Cross Covariance and Attention-Based Bi-LSTM

Enhancers are a class of noncoding DNA elements located near structural genes. In recent years, their identification and classification have been the focus of research in the field of bioinformatics. However, due to their high free scattering and position variability, although the performance of the...

Bibliographic Details
Main authors: Zhao, Shulin; Pan, Qingfeng; Zou, Quan; Ju, Ying; Shi, Lei; Su, Xi
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005296/
https://www.ncbi.nlm.nih.gov/pubmed/35422876
http://dx.doi.org/10.1155/2022/7518779
collection PubMed
description Enhancers are a class of noncoding DNA elements located near structural genes. In recent years, their identification and classification have been a focus of bioinformatics research. However, because enhancers are freely scattered and variable in position, there is still considerable room for improvement, even though the performance of prediction models has steadily improved. In this paper, density-based spatial clustering of applications with noise (DBSCAN) was used to screen the physicochemical properties of dinucleotides, from which dinucleotide-based auto-cross covariance (DACC) features were extracted; the features were then reduced with the Python feature selection toolkit MRMD 2.0. The reduced features were input into a random forest to identify enhancers. The enhancer classification model was built with word2vec and an attention-based Bi-LSTM. The accuracies of the enhancer identification and classification models were 77.25% and 73.50%, respectively, and the Matthews correlation coefficients (MCCs) were 0.5470 and 0.4881, respectively, better than the performance of most predictors.
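The description names DACC features but does not give their formula. As a rough illustration only, the sketch below follows the commonly used definition of dinucleotide auto covariance and cross covariance: each sequence is mapped to one numeric track per physicochemical property, and covariances are taken between positions separated by a lag. The property tables (`TOY_PROPS`), the function names, and the lag handling are assumptions made for this example; the values are invented and are not the properties screened in the paper.

```python
from itertools import permutations

BASES = "ACGT"
DINUCS = [a + b for a in BASES for b in BASES]

# Hypothetical property tables with made-up values (NOT real
# physicochemical indices); they only give the sketch something to run on.
TOY_PROPS = {
    "prop_a": {d: float(i) for i, d in enumerate(DINUCS)},
    "prop_b": {d: float((i * 7) % 16) for i, d in enumerate(DINUCS)},
}


def dinucleotide_track(seq, table):
    """Map a DNA string to the property value of each overlapping dinucleotide."""
    return [table[seq[i:i + 2]] for i in range(len(seq) - 1)]


def covariance_at_lag(x, y, lag):
    """Average product of mean-centred x[i] and y[i + lag] over valid positions."""
    n = len(x) - lag
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((x[i] - mean_x) * (y[i + lag] - mean_y) for i in range(n)) / n


def dacc_features(seq, props, max_lag):
    """Concatenate auto covariance (same property) and cross covariance
    (ordered pairs of different properties) for lags 1..max_lag."""
    seq = seq.upper()
    if len(seq) - 1 <= max_lag:
        raise ValueError("sequence too short for the requested lag")
    names = sorted(props)
    tracks = {name: dinucleotide_track(seq, props[name]) for name in names}
    feats = []
    for name in names:  # auto covariance part
        for lag in range(1, max_lag + 1):
            feats.append(covariance_at_lag(tracks[name], tracks[name], lag))
    for a, b in permutations(names, 2):  # cross covariance part
        for lag in range(1, max_lag + 1):
            feats.append(covariance_at_lag(tracks[a], tracks[b], lag))
    return feats
```

With N properties and a maximum lag of LAG, this sketch yields N × N × LAG values per sequence (N × LAG auto terms plus N × (N − 1) × LAG cross terms). It assumes sequences contain only A, C, G, and T; any other character raises a `KeyError`.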
format Online
Article
Text
id pubmed-9005296
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9005296 2022-04-13 Identifying and Classifying Enhancers by Dinucleotide-Based Auto-Cross Covariance and Attention-Based Bi-LSTM. Zhao, Shulin; Pan, Qingfeng; Zou, Quan; Ju, Ying; Shi, Lei; Su, Xi. Comput Math Methods Med. Research Article. Hindawi 2022-04-05. /pmc/articles/PMC9005296/ /pubmed/35422876 http://dx.doi.org/10.1155/2022/7518779 Text en. Copyright © 2022 Shulin Zhao et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Research Article