Modeling multi-species RNA modification through multi-task curriculum learning
N(6)-methyladenosine (m(6)A) is the most pervasive modification in eukaryotic mRNAs. Numerous biological processes are regulated by this critical post-transcriptional mark, such as gene expression, RNA stability, RNA structure and translation. Recently, various experimental techniques and computatio...
Main Authors: | Xiong, Yuanpeng; He, Xuan; Zhao, Dan; Tian, Tingzhong; Hong, Lixiang; Jiang, Tao; Zeng, Jianyang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Computational Biology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8053129/ https://www.ncbi.nlm.nih.gov/pubmed/33744973 http://dx.doi.org/10.1093/nar/gkab124 |
_version_ | 1783680058494287872 |
---|---|
author | Xiong, Yuanpeng He, Xuan Zhao, Dan Tian, Tingzhong Hong, Lixiang Jiang, Tao Zeng, Jianyang |
author_facet | Xiong, Yuanpeng He, Xuan Zhao, Dan Tian, Tingzhong Hong, Lixiang Jiang, Tao Zeng, Jianyang |
author_sort | Xiong, Yuanpeng |
collection | PubMed |
description | N(6)-methyladenosine (m(6)A) is the most pervasive modification in eukaryotic mRNAs. Numerous biological processes are regulated by this critical post-transcriptional mark, such as gene expression, RNA stability, RNA structure and translation. Recently, various experimental techniques and computational methods have been developed to characterize the transcriptome-wide landscapes of m(6)A modification for understanding its underlying mechanisms and functions in mRNA regulation. However, the experimental techniques are generally costly and time-consuming, while the existing computational models are usually designed only for m(6)A site prediction in a single-species and have significant limitations in accuracy, interpretability and generalizability. Here, we propose a highly interpretable computational framework, called MASS, based on a multi-task curriculum learning strategy to capture m(6)A features across multiple species simultaneously. Extensive computational experiments demonstrate the superior performances of MASS when compared to the state-of-the-art prediction methods. Furthermore, the contextual sequence features of m(6)A captured by MASS can be explained by the known critical binding motifs of the related RNA-binding proteins, which also help elucidate the similarity and difference among m(6)A features across species. In addition, based on the predicted m(6)A profiles, we further delineate the relationships between m(6)A and various properties of gene regulation, including gene expression, RNA stability, translation, RNA structure and histone modification. In summary, MASS may serve as a useful tool for characterizing m(6)A modification and studying its regulatory code. The source code of MASS can be downloaded from https://github.com/mlcb-thu/MASS. |
format | Online Article Text |
id | pubmed-8053129 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8053129 2021-04-21 Modeling multi-species RNA modification through multi-task curriculum learning Xiong, Yuanpeng He, Xuan Zhao, Dan Tian, Tingzhong Hong, Lixiang Jiang, Tao Zeng, Jianyang Nucleic Acids Res Computational Biology N(6)-methyladenosine (m(6)A) is the most pervasive modification in eukaryotic mRNAs. Numerous biological processes are regulated by this critical post-transcriptional mark, such as gene expression, RNA stability, RNA structure and translation. Recently, various experimental techniques and computational methods have been developed to characterize the transcriptome-wide landscapes of m(6)A modification for understanding its underlying mechanisms and functions in mRNA regulation. However, the experimental techniques are generally costly and time-consuming, while the existing computational models are usually designed only for m(6)A site prediction in a single-species and have significant limitations in accuracy, interpretability and generalizability. Here, we propose a highly interpretable computational framework, called MASS, based on a multi-task curriculum learning strategy to capture m(6)A features across multiple species simultaneously. Extensive computational experiments demonstrate the superior performances of MASS when compared to the state-of-the-art prediction methods. Furthermore, the contextual sequence features of m(6)A captured by MASS can be explained by the known critical binding motifs of the related RNA-binding proteins, which also help elucidate the similarity and difference among m(6)A features across species. In addition, based on the predicted m(6)A profiles, we further delineate the relationships between m(6)A and various properties of gene regulation, including gene expression, RNA stability, translation, RNA structure and histone modification. In summary, MASS may serve as a useful tool for characterizing m(6)A modification and studying its regulatory code. The source code of MASS can be downloaded from https://github.com/mlcb-thu/MASS. Oxford University Press 2021-03-21 /pmc/articles/PMC8053129/ /pubmed/33744973 http://dx.doi.org/10.1093/nar/gkab124 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Computational Biology Xiong, Yuanpeng He, Xuan Zhao, Dan Tian, Tingzhong Hong, Lixiang Jiang, Tao Zeng, Jianyang Modeling multi-species RNA modification through multi-task curriculum learning |
title | Modeling multi-species RNA modification through multi-task curriculum learning |
title_full | Modeling multi-species RNA modification through multi-task curriculum learning |
title_fullStr | Modeling multi-species RNA modification through multi-task curriculum learning |
title_full_unstemmed | Modeling multi-species RNA modification through multi-task curriculum learning |
title_short | Modeling multi-species RNA modification through multi-task curriculum learning |
title_sort | modeling multi-species rna modification through multi-task curriculum learning |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8053129/ https://www.ncbi.nlm.nih.gov/pubmed/33744973 http://dx.doi.org/10.1093/nar/gkab124 |
work_keys_str_mv | AT xiongyuanpeng modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning AT hexuan modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning AT zhaodan modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning AT tiantingzhong modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning AT honglixiang modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning AT jiangtao modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning AT zengjianyang modelingmultispeciesrnamodificationthroughmultitaskcurriculumlearning |