Splice-site identification for exon prediction using bidirectional LSTM-RNN approach

Machine learning methods have played a major role in improving the accuracy of prediction and classification of DNA (deoxyribonucleic acid) and protein sequences. In eukaryotes, however, splice-site identification and prediction is not straightforward because of numerous false positives. To solve th...

Bibliographic Details
Main Authors: Singh, Noopur; Nath, Ravindra; Singh, Dev Bukhsh
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9157471/
https://www.ncbi.nlm.nih.gov/pubmed/35663929
http://dx.doi.org/10.1016/j.bbrep.2022.101285
_version_ 1784718644316471296
author Singh, Noopur
Nath, Ravindra
Singh, Dev Bukhsh
author_facet Singh, Noopur
Nath, Ravindra
Singh, Dev Bukhsh
author_sort Singh, Noopur
collection PubMed
description Machine learning methods have played a major role in improving the accuracy of prediction and classification of DNA (deoxyribonucleic acid) and protein sequences. In eukaryotes, however, splice-site identification and prediction is not straightforward because of numerous false positives. To address this problem, we present a deep learning model based on a bidirectional Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) that identifies and predicts splice sites for the prediction of exons from eukaryotic DNA sequences. During splicing of the primary mRNA transcript, the introns (the non-coding regions of the gene) are spliced out and the exons (the coding regions) are joined. The bidirectional LSTM-RNN model uses intron features: sequences that start with the splice-site donor (GT) and end with the splice-site acceptor (AG), within its length constraints. The model was improved by increasing the number of training epochs and achieved a maximum accuracy of 95.5%. It is compatible with large sequential data such as a complete genome.
format Online
Article
Text
id pubmed-9157471
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9157471 2022-06-02 Splice-site identification for exon prediction using bidirectional LSTM-RNN approach Singh, Noopur Nath, Ravindra Singh, Dev Bukhsh Biochem Biophys Rep Research Article Machine learning methods have played a major role in improving the accuracy of prediction and classification of DNA (deoxyribonucleic acid) and protein sequences. In eukaryotes, however, splice-site identification and prediction is not straightforward because of numerous false positives. To address this problem, we present a deep learning model based on a bidirectional Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) that identifies and predicts splice sites for the prediction of exons from eukaryotic DNA sequences. During splicing of the primary mRNA transcript, the introns (the non-coding regions of the gene) are spliced out and the exons (the coding regions) are joined. The bidirectional LSTM-RNN model uses intron features: sequences that start with the splice-site donor (GT) and end with the splice-site acceptor (AG), within its length constraints. The model was improved by increasing the number of training epochs and achieved a maximum accuracy of 95.5%. It is compatible with large sequential data such as a complete genome. Elsevier 2022-05-26 /pmc/articles/PMC9157471/ /pubmed/35663929 http://dx.doi.org/10.1016/j.bbrep.2022.101285 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Research Article
Singh, Noopur
Nath, Ravindra
Singh, Dev Bukhsh
Splice-site identification for exon prediction using bidirectional LSTM-RNN approach
title Splice-site identification for exon prediction using bidirectional LSTM-RNN approach
title_full Splice-site identification for exon prediction using bidirectional LSTM-RNN approach
title_fullStr Splice-site identification for exon prediction using bidirectional LSTM-RNN approach
title_full_unstemmed Splice-site identification for exon prediction using bidirectional LSTM-RNN approach
title_short Splice-site identification for exon prediction using bidirectional LSTM-RNN approach
title_sort splice-site identification for exon prediction using bidirectional lstm-rnn approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9157471/
https://www.ncbi.nlm.nih.gov/pubmed/35663929
http://dx.doi.org/10.1016/j.bbrep.2022.101285
work_keys_str_mv AT singhnoopur splicesiteidentificationforexonpredictionusingbidirectionallstmrnnapproach
AT nathravindra splicesiteidentificationforexonpredictionusingbidirectionallstmrnnapproach
AT singhdevbukhsh splicesiteidentificationforexonpredictionusingbidirectionallstmrnnapproach
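
The description field above says the model uses intron features that start with the splice-site donor (GT) and end with the splice-site acceptor (AG), within length constraints. The record does not give the authors' code, so the following is only a minimal sketch of how such candidate spans might be enumerated before being passed to a classifier. The function name `candidate_introns` and the default bounds `min_len` and `max_len` are hypothetical and chosen for a toy example; they are not taken from the paper.

```python
from typing import Iterator, Tuple


def candidate_introns(
    seq: str,
    min_len: int = 4,   # hypothetical lower bound, not from the paper
    max_len: int = 20,  # hypothetical upper bound, not from the paper
) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) spans that begin with GT and end with AG.

    This only enumerates candidates by the GT...AG rule and a length window;
    it does not score them, which is the job of the trained model.
    """
    seq = seq.upper()
    n = len(seq)
    for start in range(n - 1):
        # A candidate must open with the donor dinucleotide.
        if seq[start:start + 2] != "GT":
            continue
        # Try every allowed span length; the last two bases must be AG.
        for end in range(start + min_len, min(start + max_len, n) + 1):
            if seq[end - 2:end] == "AG":
                yield start, end


if __name__ == "__main__":
    toy = "CCGTAAAGCC"
    for start, end in candidate_introns(toy):
        print(start, end, toy[start:end])
```

Because many GT...AG pairs in real DNA are not true splice sites, a rule like this produces the false positives the abstract mentions; a sequence model is then needed to separate real sites from decoys.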