
Characterization of statistical features for plant microRNA prediction

BACKGROUND: Several tools are available to identify miRNAs from deep-sequencing data; however, only a few of them, such as miRDeep, can identify novel miRNAs and are also available as standalone applications. Given the difference between plant and animal miRNAs, particularly in terms of distribution o...


Bibliographic Details
Main Authors: Thakur, Vivek, Wanchana, Samart, Xu, Mercedes, Bruskiewich, Richard, Quick, William Paul, Mosig, Axel, Zhu, Xin-Guang
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3053258/
https://www.ncbi.nlm.nih.gov/pubmed/21324149
http://dx.doi.org/10.1186/1471-2164-12-108
collection PubMed
description BACKGROUND: Several tools are available to identify miRNAs from deep-sequencing data; however, only a few of them, such as miRDeep, can identify novel miRNAs and are also available as standalone applications. Given the differences between plant and animal miRNAs, particularly in the distribution of hairpin length and the nature of complementarity between the miRNA and its duplex partner (the miRNA star), the underlying statistical features of miRDeep, and of other tools that use similar features, are likely to be affected. RESULTS: The potential effects on features such as minimum free energy, stability of secondary structures, and excision length were examined, and the parameters of those displaying sizable changes were estimated for plant-specific miRNAs. Most of these features acquired a new set of values or distributions for plant-specific miRNAs. While the length of conserved positions (nucleus) in mature miRNAs was longer in plants, the difference in the distribution of minimum free energy between real and background hairpins was marginal. However, the choice of source (species) of background sequences was found to affect both the minimum free energy and miRNA hairpin stability. The new parameters were tested on an Illumina dataset from maize seedlings, and the results were compared with those obtained using default parameters. The newly parameterized model was found to have much improved specificity and sensitivity over its default counterpart. CONCLUSIONS: In summary, the present study reports the behavior of a few general and tool-specific statistical features for improving the prediction accuracy of plant miRNAs from deep-sequencing data.
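The description compares the newly parameterized and default models by specificity and sensitivity. As a minimal sketch of how these two measures are usually computed from prediction counts, the Python below uses the standard definitions. The function names and the counts are hypothetical, chosen only for illustration; they are not taken from the study or from miRDeep.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real miRNAs that the tool recovered: TP / (TP + FN)."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("sensitivity is undefined without positive cases")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-miRNA hairpins correctly rejected: TN / (TN + FP)."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("specificity is undefined without negative cases")
    return true_neg / total


# Hypothetical counts for illustration only (not results from the article).
example_sensitivity = sensitivity(true_pos=80, false_neg=20)
example_specificity = specificity(true_neg=900, false_pos=100)
print(f"sensitivity={example_sensitivity:.2f} specificity={example_specificity:.2f}")
```

A model improves on both measures only if it recovers more real miRNAs while also rejecting more background hairpins, which is the sense in which the description reports an improvement.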
format Text
id pubmed-3053258
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3053258 2011-04-06 Characterization of statistical features for plant microRNA prediction Thakur, Vivek; Wanchana, Samart; Xu, Mercedes; Bruskiewich, Richard; Quick, William Paul; Mosig, Axel; Zhu, Xin-Guang BMC Genomics Methodology Article
BioMed Central 2011-02-16 /pmc/articles/PMC3053258/ /pubmed/21324149 http://dx.doi.org/10.1186/1471-2164-12-108 Text en Copyright ©2011 Thakur et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Characterization of statistical features for plant microRNA prediction
topic Methodology Article