Weakly supervised learning of RNA modifications from low-resolution epitranscriptome data

MOTIVATION: Increasing evidence suggests that post-transcriptional ribonucleic acid (RNA) modifications regulate essential biomolecular functions and are related to the pathogenesis of various diseases. Precise identification of RNA modification sites is essential for understanding the regulatory me...

Bibliographic Details
Main authors: Huang, Daiyun; Song, Bowen; Wei, Jingjue; Su, Jionglong; Coenen, Frans; Meng, Jia
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Macromolecular Sequence, Structure, and Function
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8336446/
https://www.ncbi.nlm.nih.gov/pubmed/34252943
http://dx.doi.org/10.1093/bioinformatics/btab278
collection PubMed
description MOTIVATION: Increasing evidence suggests that post-transcriptional ribonucleic acid (RNA) modifications regulate essential biomolecular functions and are related to the pathogenesis of various diseases. Precise identification of RNA modification sites is essential for understanding the regulatory mechanisms of RNAs. To date, many computational approaches for predicting RNA modifications have been developed, most of which are based on strong supervision enabled by base-resolution epitranscriptome data. However, high-resolution data may not be available. RESULTS: We propose WeakRM, the first weakly supervised learning framework for predicting RNA modifications from low-resolution epitranscriptome datasets, such as those generated from acRIP-seq and hMeRIP-seq. Evaluations on three independent datasets (corresponding to three different RNA modification types and their respective sequencing technologies) demonstrated the effectiveness of our approach in predicting RNA modifications from low-resolution data. WeakRM outperformed state-of-the-art multi-instance learning methods for genomic sequences, such as WSCNN, which was originally designed for transcription factor binding site prediction. Additionally, our approach captured motifs that are consistent with existing knowledge, and visualization of the predicted modification-containing regions revealed the potential for detecting RNA modifications with improved resolution. AVAILABILITY AND IMPLEMENTATION: The source code for the WeakRM algorithm, along with the datasets used, is freely accessible at: https://github.com/daiyun02211/WeakRM SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
id pubmed-8336446
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8336446 2021-08-09 Bioinformatics Macromolecular Sequence, Structure, and Function
Oxford University Press 2021-07-12 /pmc/articles/PMC8336446/ /pubmed/34252943 http://dx.doi.org/10.1093/bioinformatics/btab278 Text en © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Macromolecular Sequence, Structure, and Function