Geographic encoding of transcripts enabled high-accuracy and isoform-aware deep learning of RNA methylation
As the most pervasive epigenetic mark present on mRNA and lncRNA, N(6)-methyladenosine (m(6)A) RNA methylation regulates all stages of RNA life in various biological processes and disease mechanisms. Computational methods for deciphering RNA modification have achieved great success in recent years;...
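Main authors: | Huang, Daiyun; Chen, Kunqi; Song, Bowen; Wei, Zhen; Su, Jionglong; Coenen, Frans; de Magalhães, João Pedro; Rigden, Daniel J; Meng, Jia |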
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9561283/ https://www.ncbi.nlm.nih.gov/pubmed/36155798 http://dx.doi.org/10.1093/nar/gkac830 |
_version_ | 1784807918226374656 |
---|---|
author | Huang, Daiyun; Chen, Kunqi; Song, Bowen; Wei, Zhen; Su, Jionglong; Coenen, Frans; de Magalhães, João Pedro; Rigden, Daniel J; Meng, Jia |
author_sort | Huang, Daiyun |
collection | PubMed |
description | As the most pervasive epigenetic mark present on mRNA and lncRNA, N(6)-methyladenosine (m(6)A) RNA methylation regulates all stages of RNA life in various biological processes and disease mechanisms. Computational methods for deciphering RNA modification have achieved great success in recent years; nevertheless, their potential remains underexploited. One reason for this is that existing models usually consider only the sequence of transcripts, ignoring the various regions (or geography) of transcripts such as 3′UTR and intron, where the epigenetic mark forms and functions. Here, we developed three simple yet powerful encoding schemes for transcripts to capture the submolecular geographic information of RNA, which is largely independent from sequences. We show that m(6)A prediction models based on geographic information alone can achieve comparable performances to classic sequence-based methods. Importantly, geographic information substantially enhances the accuracy of sequence-based models, enables isoform- and tissue-specific prediction of m(6)A sites, and improves m(6)A signal detection from direct RNA sequencing data. The geographic encoding schemes we developed have exhibited strong interpretability, and are applicable to not only m(6)A but also N(1)-methyladenosine (m(1)A), and can serve as a general and effective complement to the widely used sequence encoding schemes in deep learning applications concerning RNA transcripts. |
format | Online Article Text |
id | pubmed-9561283 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-95612832022-10-18 Geographic encoding of transcripts enabled high-accuracy and isoform-aware deep learning of RNA methylation Huang, Daiyun Chen, Kunqi Song, Bowen Wei, Zhen Su, Jionglong Coenen, Frans de Magalhães, João Pedro Rigden, Daniel J Meng, Jia Nucleic Acids Res Computational Biology As the most pervasive epigenetic mark present on mRNA and lncRNA, N(6)-methyladenosine (m(6)A) RNA methylation regulates all stages of RNA life in various biological processes and disease mechanisms. Computational methods for deciphering RNA modification have achieved great success in recent years; nevertheless, their potential remains underexploited. One reason for this is that existing models usually consider only the sequence of transcripts, ignoring the various regions (or geography) of transcripts such as 3′UTR and intron, where the epigenetic mark forms and functions. Here, we developed three simple yet powerful encoding schemes for transcripts to capture the submolecular geographic information of RNA, which is largely independent from sequences. We show that m(6)A prediction models based on geographic information alone can achieve comparable performances to classic sequence-based methods. Importantly, geographic information substantially enhances the accuracy of sequence-based models, enables isoform- and tissue-specific prediction of m(6)A sites, and improves m(6)A signal detection from direct RNA sequencing data. The geographic encoding schemes we developed have exhibited strong interpretability, and are applicable to not only m(6)A but also N(1)-methyladenosine (m(1)A), and can serve as a general and effective complement to the widely used sequence encoding schemes in deep learning applications concerning RNA transcripts. Oxford University Press 2022-09-26 /pmc/articles/PMC9561283/ /pubmed/36155798 http://dx.doi.org/10.1093/nar/gkac830 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Geographic encoding of transcripts enabled high-accuracy and isoform-aware deep learning of RNA methylation |
topic | Computational Biology |