Cargando…
Splice-site identification for exon prediction using bidirectional LSTM-RNN approach
Machine learning methods played a major role in improving the accuracy of predictions and classification of DNA (Deoxyribonucleic Acid) and protein sequences. In eukaryotes, Splice-site identification and prediction is though not a straightforward job because of numerous false positives. To solve th...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Elsevier
2022
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9157471/ https://www.ncbi.nlm.nih.gov/pubmed/35663929 http://dx.doi.org/10.1016/j.bbrep.2022.101285 |
Sumario: | Machine learning methods played a major role in improving the accuracy of predictions and classification of DNA (Deoxyribonucleic Acid) and protein sequences. In eukaryotes, Splice-site identification and prediction is though not a straightforward job because of numerous false positives. To solve this problem, here, in this paper, we represent a bidirectional Long Short Term Memory (LSTM) Recurrent Neural Network (RNN) based deep learning model that has been developed to identify and predict the splice-sites for the prediction of exons from eukaryotic DNA sequences. During the splicing mechanism of the primary mRNA transcript, the introns, the non-coding region of the gene are spliced out and the exons, the coding region of the gene are joined. This bidirectional LSTM-RNN model uses the intron features that start with splice site donor (GT) and end with splice site acceptor (AG) in order of its length constraints. The model has been improved by increasing the number of epochs while training. This designed model achieved a maximum accuracy of 95.5%. This model is compatible with huge sequential data such as the complete genome. |
---|