Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests
The discovery of microRNAs (miRNAs) remains an important problem, particularly given the growth of high-throughput sequencing, cell sorting and single cell biology. While a large number of miRNAs have already been annotated, there may well be large numbers of miRNAs that are expressed in very partic...
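Main Authors: | Vitsios, Dimitrios M.; Kentepozidou, Elissavet; Quintais, Leonor; Benito-Gutiérrez, Elia; van Dongen, Stijn; Davis, Matthew P.; Enright, Anton J. |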
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2017 |
Subjects: | Methods Online |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5716205/ https://www.ncbi.nlm.nih.gov/pubmed/29036314 http://dx.doi.org/10.1093/nar/gkx836 |
_version_ | 1783283899070152704 |
---|---|
author | Vitsios, Dimitrios M.; Kentepozidou, Elissavet; Quintais, Leonor; Benito-Gutiérrez, Elia; van Dongen, Stijn; Davis, Matthew P.; Enright, Anton J. |
author_facet | Vitsios, Dimitrios M.; Kentepozidou, Elissavet; Quintais, Leonor; Benito-Gutiérrez, Elia; van Dongen, Stijn; Davis, Matthew P.; Enright, Anton J. |
author_sort | Vitsios, Dimitrios M. |
collection | PubMed |
description | The discovery of microRNAs (miRNAs) remains an important problem, particularly given the growth of high-throughput sequencing, cell sorting and single cell biology. While a large number of miRNAs have already been annotated, there may well be large numbers of miRNAs that are expressed in very particular cell types and remain elusive. Sequencing allows us to quickly and accurately identify the expression of known miRNAs from small RNA-Seq data. The biogenesis of miRNAs leads to very specific characteristics observed in their sequences. In brief, miRNAs usually have a well-defined 5′ end and a more flexible 3′ end with the possibility of 3′ tailing events, such as uridylation. Previous approaches to the prediction of novel miRNAs usually involve the analysis of structural features of miRNA precursor hairpin sequences obtained from genome sequence. We surmised that it may be possible to identify miRNAs by using these biogenesis features observed directly from sequenced reads, solely or in addition to structural analysis from genome data. To this end, we have developed mirnovo, a machine learning based algorithm, which is able to identify known and novel miRNAs in animals and plants directly from small RNA-Seq data, with or without a reference genome. This method performs comparably to existing tools, but is simpler to use and has a reduced run time. Its performance and accuracy have been tested on multiple datasets, including species with poorly assembled genomes, RNaseIII (Drosha and/or Dicer) deficient samples and single cells (at both embryonic and adult stages). |
format | Online Article Text |
id | pubmed-5716205 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-57162052017-12-08 Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests Vitsios, Dimitrios M. Kentepozidou, Elissavet Quintais, Leonor Benito-Gutiérrez, Elia van Dongen, Stijn Davis, Matthew P. Enright, Anton J. Nucleic Acids Res Methods Online The discovery of microRNAs (miRNAs) remains an important problem, particularly given the growth of high-throughput sequencing, cell sorting and single cell biology. While a large number of miRNAs have already been annotated, there may well be large numbers of miRNAs that are expressed in very particular cell types and remain elusive. Sequencing allows us to quickly and accurately identify the expression of known miRNAs from small RNA-Seq data. The biogenesis of miRNAs leads to very specific characteristics observed in their sequences. In brief, miRNAs usually have a well-defined 5′ end and a more flexible 3′ end with the possibility of 3′ tailing events, such as uridylation. Previous approaches to the prediction of novel miRNAs usually involve the analysis of structural features of miRNA precursor hairpin sequences obtained from genome sequence. We surmised that it may be possible to identify miRNAs by using these biogenesis features observed directly from sequenced reads, solely or in addition to structural analysis from genome data. To this end, we have developed mirnovo, a machine learning based algorithm, which is able to identify known and novel miRNAs in animals and plants directly from small RNA-Seq data, with or without a reference genome. This method performs comparably to existing tools, however is simpler to use with reduced run time. Its performance and accuracy has been tested on multiple datasets, including species with poorly assembled genomes, RNaseIII (Drosha and/or Dicer) deficient samples and single cells (at both embryonic and adult stage). 
Oxford University Press 2017-12-01 2017-09-25 /pmc/articles/PMC5716205/ /pubmed/29036314 http://dx.doi.org/10.1093/nar/gkx836 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Vitsios, Dimitrios M. Kentepozidou, Elissavet Quintais, Leonor Benito-Gutiérrez, Elia van Dongen, Stijn Davis, Matthew P. Enright, Anton J. Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests |
title | Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests |
title_full | Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests |
title_fullStr | Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests |
title_full_unstemmed | Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests |
title_short | Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests |
title_sort | mirnovo: genome-free prediction of micrornas from small rna sequencing data and single-cells using decision forests |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5716205/ https://www.ncbi.nlm.nih.gov/pubmed/29036314 http://dx.doi.org/10.1093/nar/gkx836 |
work_keys_str_mv | AT vitsiosdimitriosm mirnovogenomefreepredictionofmicrornasfromsmallrnasequencingdataandsinglecellsusingdecisionforests AT kentepozidouelissavet mirnovogenomefreepredictionofmicrornasfromsmallrnasequencingdataandsinglecellsusingdecisionforests AT quintaisleonor mirnovogenomefreepredictionofmicrornasfromsmallrnasequencingdataandsinglecellsusingdecisionforests AT benitogutierrezelia mirnovogenomefreepredictionofmicrornasfromsmallrnasequencingdataandsinglecellsusingdecisionforests AT vandongenstijn mirnovogenomefreepredictionofmicrornasfromsmallrnasequencingdataandsinglecellsusingdecisionforests AT davismatthewp mirnovogenomefreepredictionofmicrornasfromsmallrnasequencingdataandsinglecellsusingdecisionforests AT enrightantonj mirnovogenomefreepredictionofmicrornasfromsmallrnasequencingdataandsinglecellsusingdecisionforests |