Modeling multi-species RNA modification through multi-task curriculum learning

N(6)-methyladenosine (m(6)A) is the most pervasive modification in eukaryotic mRNAs. Numerous biological processes are regulated by this critical post-transcriptional mark, such as gene expression, RNA stability, RNA structure and translation. Recently, various experimental techniques and computatio...

Bibliographic Details
Main authors: Xiong, Yuanpeng, He, Xuan, Zhao, Dan, Tian, Tingzhong, Hong, Lixiang, Jiang, Tao, Zeng, Jianyang
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8053129/
https://www.ncbi.nlm.nih.gov/pubmed/33744973
http://dx.doi.org/10.1093/nar/gkab124
_version_ 1783680058494287872
author Xiong, Yuanpeng
He, Xuan
Zhao, Dan
Tian, Tingzhong
Hong, Lixiang
Jiang, Tao
Zeng, Jianyang
author_facet Xiong, Yuanpeng
He, Xuan
Zhao, Dan
Tian, Tingzhong
Hong, Lixiang
Jiang, Tao
Zeng, Jianyang
author_sort Xiong, Yuanpeng
collection PubMed
description N(6)-methyladenosine (m(6)A) is the most pervasive modification in eukaryotic mRNAs. Numerous biological processes are regulated by this critical post-transcriptional mark, such as gene expression, RNA stability, RNA structure and translation. Recently, various experimental techniques and computational methods have been developed to characterize the transcriptome-wide landscapes of m(6)A modification for understanding its underlying mechanisms and functions in mRNA regulation. However, the experimental techniques are generally costly and time-consuming, while the existing computational models are usually designed only for m(6)A site prediction in a single species and have significant limitations in accuracy, interpretability and generalizability. Here, we propose a highly interpretable computational framework, called MASS, based on a multi-task curriculum learning strategy to capture m(6)A features across multiple species simultaneously. Extensive computational experiments demonstrate the superior performance of MASS when compared to the state-of-the-art prediction methods. Furthermore, the contextual sequence features of m(6)A captured by MASS can be explained by the known critical binding motifs of the related RNA-binding proteins, which also help elucidate the similarities and differences among m(6)A features across species. In addition, based on the predicted m(6)A profiles, we further delineate the relationships between m(6)A and various properties of gene regulation, including gene expression, RNA stability, translation, RNA structure and histone modification. In summary, MASS may serve as a useful tool for characterizing m(6)A modification and studying its regulatory code. The source code of MASS can be downloaded from https://github.com/mlcb-thu/MASS.
format Online
Article
Text
id pubmed-8053129
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8053129 2021-04-21 Modeling multi-species RNA modification through multi-task curriculum learning Xiong, Yuanpeng He, Xuan Zhao, Dan Tian, Tingzhong Hong, Lixiang Jiang, Tao Zeng, Jianyang Nucleic Acids Res Computational Biology N(6)-methyladenosine (m(6)A) is the most pervasive modification in eukaryotic mRNAs. Numerous biological processes are regulated by this critical post-transcriptional mark, such as gene expression, RNA stability, RNA structure and translation. Recently, various experimental techniques and computational methods have been developed to characterize the transcriptome-wide landscapes of m(6)A modification for understanding its underlying mechanisms and functions in mRNA regulation. However, the experimental techniques are generally costly and time-consuming, while the existing computational models are usually designed only for m(6)A site prediction in a single species and have significant limitations in accuracy, interpretability and generalizability. Here, we propose a highly interpretable computational framework, called MASS, based on a multi-task curriculum learning strategy to capture m(6)A features across multiple species simultaneously. Extensive computational experiments demonstrate the superior performance of MASS when compared to the state-of-the-art prediction methods. Furthermore, the contextual sequence features of m(6)A captured by MASS can be explained by the known critical binding motifs of the related RNA-binding proteins, which also help elucidate the similarities and differences among m(6)A features across species. In addition, based on the predicted m(6)A profiles, we further delineate the relationships between m(6)A and various properties of gene regulation, including gene expression, RNA stability, translation, RNA structure and histone modification. In summary, MASS may serve as a useful tool for characterizing m(6)A modification and studying its regulatory code. 
The source code of MASS can be downloaded from https://github.com/mlcb-thu/MASS. Oxford University Press 2021-03-21 /pmc/articles/PMC8053129/ /pubmed/33744973 http://dx.doi.org/10.1093/nar/gkab124 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Computational Biology
Xiong, Yuanpeng
He, Xuan
Zhao, Dan
Tian, Tingzhong
Hong, Lixiang
Jiang, Tao
Zeng, Jianyang
Modeling multi-species RNA modification through multi-task curriculum learning
title Modeling multi-species RNA modification through multi-task curriculum learning
title_full Modeling multi-species RNA modification through multi-task curriculum learning
title_fullStr Modeling multi-species RNA modification through multi-task curriculum learning
title_full_unstemmed Modeling multi-species RNA modification through multi-task curriculum learning
title_short Modeling multi-species RNA modification through multi-task curriculum learning
title_sort modeling multi-species rna modification through multi-task curriculum learning
topic Computational Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8053129/
https://www.ncbi.nlm.nih.gov/pubmed/33744973
http://dx.doi.org/10.1093/nar/gkab124
work_keys_str_mv AT xiongyuanpeng modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning
AT hexuan modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning
AT zhaodan modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning
AT tiantingzhong modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning
AT honglixiang modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning
AT jiangtao modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning
AT zengjianyang modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning