BrumiR: A toolkit for de novo discovery of microRNAs from sRNA-seq data

Bibliographic Details
Main Authors: Moraga, Carol; Sanchez, Evelyn; Ferrarini, Mariana Galvão; Gutierrez, Rodrigo A; Vidal, Elena A; Sagot, Marie-France
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Technical Note
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9596168/
https://www.ncbi.nlm.nih.gov/pubmed/36283679
http://dx.doi.org/10.1093/gigascience/giac093
_version_ 1784815809266188288
author Moraga, Carol
Sanchez, Evelyn
Ferrarini, Mariana Galvão
Gutierrez, Rodrigo A
Vidal, Elena A
Sagot, Marie-France
author_sort Moraga, Carol
collection PubMed
description MicroRNAs (miRNAs) are small noncoding RNAs that are key players in the regulation of gene expression. In the past decade, with the increasing accessibility of high-throughput sequencing technologies, different methods have been developed to identify miRNAs, most of which rely on preexisting reference genomes. However, when a reference genome is absent or is not of high quality, such identification becomes more difficult. In this context, we developed BrumiR, an algorithm that is able to discover miRNAs directly and exclusively from small RNA (sRNA) sequencing (sRNA-seq) data. We benchmarked BrumiR with datasets encompassing animal and plant species using real and simulated sRNA-seq experiments. The results demonstrate that BrumiR reaches the highest recall for miRNA discovery, while at the same time being much faster and more efficient than the state-of-the-art tools evaluated. This efficiency allows BrumiR to analyze a large number of sRNA-seq experiments from plant or animal species. Moreover, BrumiR detects additional information regarding other expressed sequences (sRNAs, isomiRs, etc.), thus maximizing the biological insight gained from sRNA-seq experiments. Additionally, when a reference genome is available, BrumiR provides a new mapping tool (BrumiR2reference) that performs an a posteriori exhaustive search to identify the precursor sequences. Finally, we also provide a machine learning classifier based on a random forest model that evaluates the sequence-derived features to further refine the prediction obtained from the BrumiR-core. The code of BrumiR and all the algorithms that compose the BrumiR toolkit are freely available at https://github.com/camoragaq/BrumiR.
format Online
Article
Text
id pubmed-9596168
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9596168 2022-11-22 BrumiR: A toolkit for de novo discovery of microRNAs from sRNA-seq data. Moraga, Carol; Sanchez, Evelyn; Ferrarini, Mariana Galvão; Gutierrez, Rodrigo A; Vidal, Elena A; Sagot, Marie-France. Gigascience. Technical Note.
Oxford University Press 2022-10-25 /pmc/articles/PMC9596168/ /pubmed/36283679 http://dx.doi.org/10.1093/gigascience/giac093 Text en © The Author(s) 2022. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title BrumiR: A toolkit for de novo discovery of microRNAs from sRNA-seq data
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9596168/
https://www.ncbi.nlm.nih.gov/pubmed/36283679
http://dx.doi.org/10.1093/gigascience/giac093