Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests

The discovery of microRNAs (miRNAs) remains an important problem, particularly given the growth of high-throughput sequencing, cell sorting and single cell biology. While a large number of miRNAs have already been annotated, there may well be large numbers of miRNAs that are expressed in very partic...

Bibliographic Details
Main Authors: Vitsios, Dimitrios M.; Kentepozidou, Elissavet; Quintais, Leonor; Benito-Gutiérrez, Elia; van Dongen, Stijn; Davis, Matthew P.; Enright, Anton J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects: Methods Online
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5716205/
https://www.ncbi.nlm.nih.gov/pubmed/29036314
http://dx.doi.org/10.1093/nar/gkx836
collection PubMed
description The discovery of microRNAs (miRNAs) remains an important problem, particularly given the growth of high-throughput sequencing, cell sorting and single-cell biology. While a large number of miRNAs have already been annotated, there may well be many more that are expressed only in very particular cell types and remain elusive. Sequencing allows us to quickly and accurately identify the expression of known miRNAs from small RNA-Seq data. The biogenesis of miRNAs leads to very specific characteristics observed in their sequences. In brief, miRNAs usually have a well-defined 5′ end and a more flexible 3′ end with the possibility of 3′ tailing events, such as uridylation. Previous approaches to the prediction of novel miRNAs usually involve the analysis of structural features of miRNA precursor hairpin sequences obtained from genome sequence. We surmised that it may be possible to identify miRNAs by using these biogenesis features observed directly from sequenced reads, solely or in addition to structural analysis from genome data. To this end, we have developed mirnovo, a machine-learning-based algorithm that is able to identify known and novel miRNAs in animals and plants directly from small RNA-Seq data, with or without a reference genome. This method performs comparably to existing tools but is simpler to use and has a reduced run time. Its performance and accuracy have been tested on multiple datasets, including species with poorly assembled genomes, RNaseIII (Drosha and/or Dicer) deficient samples and single cells (at both embryonic and adult stages).
id pubmed-5716205
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-5716205 2017-12-08 Nucleic Acids Res Methods Online
Oxford University Press 2017-12-01 2017-09-25 /pmc/articles/PMC5716205/ /pubmed/29036314 http://dx.doi.org/10.1093/nar/gkx836 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
topic Methods Online