
Deep learning predicts short non-coding RNA functions from only raw sequence data

Small non-coding RNAs (ncRNAs) are short non-coding sequences involved in gene regulation in many biological processes and diseases. The lack of a complete comprehension of their biological functionality, especially in a genome-wide scenario, has demanded new computational approaches to annotate the...


Bibliographic Details
Main Authors: Noviello, Teresa Maria Rosaria, Ceccarelli, Francesco, Ceccarelli, Michele, Cerulo, Luigi
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7682815/
https://www.ncbi.nlm.nih.gov/pubmed/33175836
http://dx.doi.org/10.1371/journal.pcbi.1008415
author Noviello, Teresa Maria Rosaria
Ceccarelli, Francesco
Ceccarelli, Michele
Cerulo, Luigi
collection PubMed
description Small non-coding RNAs (ncRNAs) are short non-coding sequences involved in gene regulation in many biological processes and diseases. The lack of a complete comprehension of their biological functionality, especially in a genome-wide scenario, has demanded new computational approaches to annotate their roles. Secondary structure is widely regarded as a key determinant of RNA function, and machine-learning-based approaches have been shown to predict RNA function from secondary structure information. Here we show that RNA function can be predicted with good accuracy from a lightweight representation of sequence information, without computing secondary structure features, which is computationally expensive. This finding appears to go against the dogma of secondary structure being a key determinant of function in RNA. Compared to recent secondary-structure-based methods, the proposed solution is more robust to sequence boundary noise and drastically reduces the computational cost, allowing for the annotation of large data volumes. Scripts and datasets to reproduce the results of the experiments in this study are available at: https://github.com/bioinformatics-sannio/ncrna-deep.
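As a rough illustration of what a "lightweight representation of sequence information" could look like, the sketch below one-hot encodes an RNA sequence into a fixed-length matrix. This is only an assumption for illustration: the record does not state which encoding the authors use, and the function name `one_hot_encode`, the `ACGU` column order, and the zero-padding scheme are hypothetical choices, not taken from the study or its repository.

```python
# Hypothetical sketch: the alphabet order, padding scheme, and function name
# are assumptions for illustration, not the encoding used in the study.
ALPHABET = "ACGU"


def one_hot_encode(sequence, length):
    """Encode an RNA sequence as a `length` x 4 matrix (list of lists).

    DNA-style 'T' is treated as 'U'. Unknown symbols (e.g. 'N') and
    padding positions are encoded as all-zero rows. Sequences longer
    than `length` are truncated.
    """
    seq = sequence.upper().replace("T", "U")[:length]
    matrix = []
    for base in seq:
        row = [0] * len(ALPHABET)
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        matrix.append(row)
    # Right-pad with all-zero rows up to the fixed length.
    matrix.extend([0] * len(ALPHABET) for _ in range(length - len(seq)))
    return matrix


encoded = one_hot_encode("GAUN", 6)
for row in encoded:
    print(row)
```

A matrix of this shape can be fed directly to a convolutional network, which is the sense in which no secondary structure needs to be computed first.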
format Online
Article
Text
id pubmed-7682815
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7682815 2020-12-02 PLoS Comput Biol Research Article Public Library of Science 2020-11-11 /pmc/articles/PMC7682815/ /pubmed/33175836 http://dx.doi.org/10.1371/journal.pcbi.1008415 Text en © 2020 Noviello et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Deep learning predicts short non-coding RNA functions from only raw sequence data
topic Research Article