Attention-Based Deep Multiple-Instance Learning for Classifying Circular RNA and Other Long Non-Coding RNA

Bibliographic Details
Main authors: Liu, Yunhe; Fu, Qiqing; Peng, Xueqing; Zhu, Chaoyu; Liu, Gang; Liu, Lei
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8701965/
https://www.ncbi.nlm.nih.gov/pubmed/34946967
http://dx.doi.org/10.3390/genes12122018
Description
Summary: Circular RNA (circRNA) is a distinct, circular form of long non-coding RNA (lncRNA) with specific roles in transcriptional regulation and multiple biological processes. Distinguishing circRNA from other lncRNA is necessary for related research. In this study, we designed an attention-based multiple-instance learning (MIL) network architecture that takes the raw sequence as input, learns the sparse features of RNA sequences, and performs the circRNA identification task. The model outperformed state-of-the-art models. Moreover, after validating the effectiveness of the attention mechanism on a handwritten digit dataset, we obtained the key sequence loci underlying circRNA recognition from the corresponding attention scores. Motif enrichment analysis then identified some of the key motifs for circRNA formation. In conclusion, we designed a deep learning network architecture suited to learning gene sequences with sparse features and applied it to the circRNA identification task; the model shows strong representational capability in indicating some key loci.
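
The summary does not spell out the network itself, so the following is only a minimal sketch of one common formulation of attention-based MIL pooling, not the authors' architecture. It treats a sequence as a bag of fixed-size windows (instances), scores each instance as w · tanh(V h_k), normalizes the scores with a softmax, and pools the instances by their attention weights. The window size, stride, hidden dimension, toy sequence, random untrained weights, and all function names (`one_hot`, `make_bag`, `attention_pool`) are assumptions made for illustration.

```python
import math
import random

def one_hot(seq):
    """Encode an RNA string as a list of 4-dim one-hot vectors (A, C, G, U)."""
    alphabet = "ACGU"
    return [[1.0 if base == a else 0.0 for a in alphabet] for base in seq]

def make_bag(seq, window=4, stride=4):
    """Split a sequence into fixed-size windows; each flattened window is one instance."""
    encoded = one_hot(seq)
    bag = []
    for start in range(0, len(encoded) - window + 1, stride):
        flat = [x for vec in encoded[start:start + window] for x in vec]
        bag.append(flat)
    return bag

def attention_pool(bag, V, w):
    """Return (attention weights, pooled bag vector).

    score_k = w . tanh(V h_k); weights = softmax(scores); z = sum_k a_k * h_k
    """
    scores = []
    for h in bag:
        hidden = [math.tanh(sum(v_ij * h_j for v_ij, h_j in zip(row, h))) for row in V]
        scores.append(sum(w_i * u_i for w_i, u_i in zip(w, hidden)))
    m = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(bag[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, bag)) for j in range(dim)]
    return weights, pooled

# Toy example with random, untrained parameters (hypothetical values).
rng = random.Random(0)
bag = make_bag("AUGGCUACGUAGCUAG", window=4, stride=4)  # 16 nt -> 4 instances
hidden_dim = 8
V = [[rng.uniform(-1, 1) for _ in range(len(bag[0]))] for _ in range(hidden_dim)]
w = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
weights, pooled = attention_pool(bag, V, w)

# Index of the window with the highest attention weight.
top_instance = max(range(len(weights)), key=weights.__getitem__)
```

In a trained model of this kind, the pooled vector would feed a classifier, and the per-instance weights are what would let high-attention windows be read as candidate key loci. With the random parameters used here, the weights carry no biological meaning.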