Modeling multi-species RNA modification through multi-task curriculum learning

Bibliographic Details
Main authors: Xiong, Yuanpeng; He, Xuan; Zhao, Dan; Tian, Tingzhong; Hong, Lixiang; Jiang, Tao; Zeng, Jianyang
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8053129/
https://www.ncbi.nlm.nih.gov/pubmed/33744973
http://dx.doi.org/10.1093/nar/gkab124
Description
Summary: N(6)-methyladenosine (m(6)A) is the most pervasive modification in eukaryotic mRNAs. This critical post-transcriptional mark regulates numerous biological processes, including gene expression, RNA stability, RNA structure and translation. Recently, various experimental techniques and computational methods have been developed to characterize the transcriptome-wide landscape of m(6)A modification and to understand its underlying mechanisms and functions in mRNA regulation. However, the experimental techniques are generally costly and time-consuming, while the existing computational models are usually designed only for m(6)A site prediction in a single species and have significant limitations in accuracy, interpretability and generalizability. Here, we propose a highly interpretable computational framework, called MASS, based on a multi-task curriculum learning strategy to capture m(6)A features across multiple species simultaneously. Extensive computational experiments demonstrate the superior performance of MASS compared to state-of-the-art prediction methods. Furthermore, the contextual sequence features of m(6)A captured by MASS can be explained by the known critical binding motifs of the related RNA-binding proteins, which also helps elucidate the similarities and differences among m(6)A features across species. In addition, based on the predicted m(6)A profiles, we further delineate the relationships between m(6)A and various properties of gene regulation, including gene expression, RNA stability, translation, RNA structure and histone modification. In summary, MASS may serve as a useful tool for characterizing m(6)A modification and studying its regulatory code. The source code of MASS can be downloaded from https://github.com/mlcb-thu/MASS.