Identifying and Classifying Enhancers by Dinucleotide-Based Auto-Cross Covariance and Attention-Based Bi-LSTM
Enhancers are a class of noncoding DNA elements located near structural genes. In recent years, their identification and classification have been the focus of research in the field of bioinformatics. However, due to their high free scattering and position variability, although the performance of the...
Main authors: | Zhao, Shulin; Pan, Qingfeng; Zou, Quan; Ju, Ying; Shi, Lei; Su, Xi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005296/ https://www.ncbi.nlm.nih.gov/pubmed/35422876 http://dx.doi.org/10.1155/2022/7518779 |
_version_ | 1784686427691286528 |
---|---|
author | Zhao, Shulin Pan, Qingfeng Zou, Quan Ju, Ying Shi, Lei Su, Xi |
author_facet | Zhao, Shulin Pan, Qingfeng Zou, Quan Ju, Ying Shi, Lei Su, Xi |
author_sort | Zhao, Shulin |
collection | PubMed |
description | Enhancers are a class of noncoding DNA elements located near structural genes. In recent years, their identification and classification have been the focus of research in the field of bioinformatics. However, due to their high free scattering and position variability, although the performance of the prediction model has been continuously improved, there is still a lot of room for progress. In this paper, density-based spatial clustering of applications with noise (DBSCAN) was used to screen the physicochemical properties of dinucleotides to extract dinucleotide-based auto-cross covariance (DACC) features; then, the features are reduced by feature selection Python toolkit MRMD 2.0. The reduced features are input into the random forest to identify enhancers. The enhancer classification model was built by word2vec and attention-based Bi-LSTM. Finally, the accuracies of our enhancer identification and classification models were 77.25% and 73.50%, respectively, and the Matthews' correlation coefficients (MCCs) were 0.5470 and 0.4881, respectively, which were better than the performance of most predictors. |
format | Online Article Text |
id | pubmed-9005296 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-9005296 2022-04-13 Identifying and Classifying Enhancers by Dinucleotide-Based Auto-Cross Covariance and Attention-Based Bi-LSTM Zhao, Shulin Pan, Qingfeng Zou, Quan Ju, Ying Shi, Lei Su, Xi Comput Math Methods Med Research Article Enhancers are a class of noncoding DNA elements located near structural genes. In recent years, their identification and classification have been the focus of research in the field of bioinformatics. However, due to their high free scattering and position variability, although the performance of the prediction model has been continuously improved, there is still a lot of room for progress. In this paper, density-based spatial clustering of applications with noise (DBSCAN) was used to screen the physicochemical properties of dinucleotides to extract dinucleotide-based auto-cross covariance (DACC) features; then, the features are reduced by feature selection Python toolkit MRMD 2.0. The reduced features are input into the random forest to identify enhancers. The enhancer classification model was built by word2vec and attention-based Bi-LSTM. Finally, the accuracies of our enhancer identification and classification models were 77.25% and 73.50%, respectively, and the Matthews' correlation coefficients (MCCs) were 0.5470 and 0.4881, respectively, which were better than the performance of most predictors. Hindawi 2022-04-05 /pmc/articles/PMC9005296/ /pubmed/35422876 http://dx.doi.org/10.1155/2022/7518779 Text en Copyright © 2022 Shulin Zhao et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Identifying and Classifying Enhancers by Dinucleotide-Based Auto-Cross Covariance and Attention-Based Bi-LSTM |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005296/ https://www.ncbi.nlm.nih.gov/pubmed/35422876 http://dx.doi.org/10.1155/2022/7518779 |
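
The description above names dinucleotide-based auto-cross covariance (DACC) features without giving a formula. The following is a minimal sketch of the general idea only, not the authors' implementation: the property table `PROPS` holds made-up placeholder values, and the names `toy_property`, `dacc`, and `max_lag` are hypothetical. The record does not state which physicochemical properties DBSCAN selected or which lag was used.

```python
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]  # 16 dinucleotides


def toy_property(seed):
    """Return made-up placeholder values for the 16 dinucleotides.

    These numbers are NOT real physicochemical properties; they only
    give the sketch something to compute with.
    """
    return {d: ((i * seed) % 7 - 3) / 3.0 for i, d in enumerate(DINUCS)}


# Hypothetical property table: property name -> {dinucleotide: value}
PROPS = {"prop_a": toy_property(3), "prop_b": toy_property(5)}


def dacc(seq, props, max_lag):
    """Auto-cross covariance over overlapping dinucleotides.

    For each lag and each ordered pair of properties (u1, u2), average
    the product of mean-centred values at positions i and i + lag.
    Pairs with u1 == u2 are the auto covariance terms; the remaining
    pairs are the cross covariance terms.
    """
    seq = seq.upper()
    dinucs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = len(dinucs)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the number of dinucleotides")

    names = sorted(props)
    # Property value at every dinucleotide position, plus its sequence mean
    series = {u: [props[u][d] for d in dinucs] for u in names}
    means = {u: sum(vals) / n for u, vals in series.items()}

    features = []
    for lag in range(1, max_lag + 1):
        for u1 in names:
            for u2 in names:
                total = sum(
                    (series[u1][i] - means[u1]) * (series[u2][i + lag] - means[u2])
                    for i in range(n - lag)
                )
                features.append(total / (n - lag))
    return features


if __name__ == "__main__":
    vector = dacc("ACGTTGCAAGCTAGCT", PROPS, max_lag=2)
    print(len(vector), "features")
```

With N properties and a maximum lag of `max_lag`, this sketch yields N × N × `max_lag` values per sequence. Characters other than A, C, G, and T raise a `KeyError` here; a real pipeline would need to decide how to handle them.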