Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset
MiRNAs are short non-coding RNAs of about 22 nucleotides, which play critical roles in gene expression regulation. The biogenesis of miRNAs is largely determined by the sequence and structural features of their parental RNA molecules. Based on these features, multiple computational tools have been d...
Main Authors: | Xue, Bin; Lipps, David; Devineni, Sree |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5176297/ https://www.ncbi.nlm.nih.gov/pubmed/28002428 http://dx.doi.org/10.1371/journal.pone.0168392 |
_version_ | 1782484796416458752 |
---|---|
author | Xue, Bin Lipps, David Devineni, Sree |
author_facet | Xue, Bin Lipps, David Devineni, Sree |
author_sort | Xue, Bin |
collection | PubMed |
description | MiRNAs are short non-coding RNAs of about 22 nucleotides, which play critical roles in gene expression regulation. The biogenesis of miRNAs is largely determined by the sequence and structural features of their parental RNA molecules. Based on these features, multiple computational tools have been developed to predict whether RNA transcripts contain miRNAs. Although very successful, these predictors have faced multiple challenges in recent years. Many predictors were optimized using datasets of hundreds of miRNA samples. The sizes of these datasets are much smaller than the number of known miRNAs. Consequently, the prediction accuracy of these predictors on large datasets is unknown and needs to be re-tested. In addition, many predictors were optimized for either high sensitivity or high specificity. These optimization strategies may impose serious limitations in applications. Moreover, to meet continuously rising expectations for these computational tools, improving prediction accuracy has become extremely important. In this study, a meta-predictor, mirMeta, was developed by integrating a set of non-linear transformations with a meta-strategy. More specifically, the outputs of five individual predictors were first preprocessed using non-linear transformations, and then fed into an artificial neural network to make the meta-prediction. The prediction accuracy of the meta-predictor was validated using both multi-fold cross-validation and an independent dataset. The final accuracy of the meta-predictor on a newly designed large dataset is improved by 7%, reaching 93%. The meta-predictor is also shown to be less dependent on datasets and to have a refined balance between sensitivity and specificity. This study is important in two respects: First, it shows that the combination of non-linear transformations and artificial neural networks improves on the prediction accuracy of the individual predictors. Second, a new miRNA predictor with significantly improved prediction accuracy has been developed for the community to identify novel miRNAs and the complete set of miRNAs. Source code is available at: https://github.com/xueLab/mirMeta |
format | Online Article Text |
id | pubmed-5176297 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5176297 2017-01-04 Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset Xue, Bin Lipps, David Devineni, Sree PLoS One Research Article MiRNAs are short non-coding RNAs of about 22 nucleotides, which play critical roles in gene expression regulation. The biogenesis of miRNAs is largely determined by the sequence and structural features of their parental RNA molecules. Based on these features, multiple computational tools have been developed to predict whether RNA transcripts contain miRNAs. Although very successful, these predictors have faced multiple challenges in recent years. Many predictors were optimized using datasets of hundreds of miRNA samples. The sizes of these datasets are much smaller than the number of known miRNAs. Consequently, the prediction accuracy of these predictors on large datasets is unknown and needs to be re-tested. In addition, many predictors were optimized for either high sensitivity or high specificity. These optimization strategies may impose serious limitations in applications. Moreover, to meet continuously rising expectations for these computational tools, improving prediction accuracy has become extremely important. In this study, a meta-predictor, mirMeta, was developed by integrating a set of non-linear transformations with a meta-strategy. More specifically, the outputs of five individual predictors were first preprocessed using non-linear transformations, and then fed into an artificial neural network to make the meta-prediction. The prediction accuracy of the meta-predictor was validated using both multi-fold cross-validation and an independent dataset. The final accuracy of the meta-predictor on a newly designed large dataset is improved by 7%, reaching 93%. The meta-predictor is also shown to be less dependent on datasets and to have a refined balance between sensitivity and specificity. This study is important in two respects: First, it shows that the combination of non-linear transformations and artificial neural networks improves on the prediction accuracy of the individual predictors. Second, a new miRNA predictor with significantly improved prediction accuracy has been developed for the community to identify novel miRNAs and the complete set of miRNAs. Source code is available at: https://github.com/xueLab/mirMeta Public Library of Science 2016-12-21 /pmc/articles/PMC5176297/ /pubmed/28002428 http://dx.doi.org/10.1371/journal.pone.0168392 Text en © 2016 Xue et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Xue, Bin Lipps, David Devineni, Sree Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset |
title | Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset |
title_full | Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset |
title_fullStr | Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset |
title_full_unstemmed | Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset |
title_short | Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset |
title_sort | integrated strategy improves the prediction accuracy of mirna in large dataset |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5176297/ https://www.ncbi.nlm.nih.gov/pubmed/28002428 http://dx.doi.org/10.1371/journal.pone.0168392 |
work_keys_str_mv | AT xuebin integratedstrategyimprovesthepredictionaccuracyofmirnainlargedataset AT lippsdavid integratedstrategyimprovesthepredictionaccuracyofmirnainlargedataset AT devinenisree integratedstrategyimprovesthepredictionaccuracyofmirnainlargedataset |
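
The abstract in this record describes a two-stage pipeline: the outputs of five individual predictors are passed through non-linear transformations and then fed into an artificial neural network. The sketch below illustrates that general idea only; it is not the mirMeta implementation. The specific transforms (square and square root), the single hidden layer, the placeholder weights, the example scores, and every function name are assumptions made for illustration, since the record does not state them. In practice the weights would be learned from labelled data rather than written by hand.

```python
import math
from typing import List, Sequence, Tuple


def transform(scores: Sequence[float]) -> List[float]:
    """Expand each raw predictor score s into (s, s^2, sqrt(s)).

    These particular transforms are assumed for illustration only.
    """
    features: List[float] = []
    for s in scores:
        s = min(max(s, 0.0), 1.0)  # clamp to [0, 1] so sqrt is defined
        features.extend([s, s * s, math.sqrt(s)])
    return features


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def meta_predict(
    scores: Sequence[float],
    hidden_weights: Sequence[Sequence[float]],
    hidden_biases: Sequence[float],
    output_weights: Sequence[float],
    output_bias: float,
) -> float:
    """Forward pass of a one-hidden-layer network over transformed scores."""
    x = transform(scores)
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def evaluate(labels: Sequence[int], predictions: Sequence[int]) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) for binary 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    accuracy = (tp + tn) / len(pairs) if pairs else float("nan")
    return sensitivity, specificity, accuracy


# Hypothetical scores from five individual predictors for one candidate.
scores = [0.9, 0.8, 0.4, 0.7, 0.95]
n_features = len(transform(scores))  # 5 scores x 3 transforms

# Placeholder (untrained) weights for a network with two hidden units.
hidden_weights = [[0.1] * n_features, [-0.1] * n_features]
hidden_biases = [0.0, 0.0]
output_weights = [1.0, -1.0]

probability = meta_predict(scores, hidden_weights, hidden_biases, output_weights, 0.0)
print(f"meta-prediction score: {probability:.3f}")
```

Reporting sensitivity and specificity alongside accuracy, as `evaluate` does, reflects the abstract's concern that a predictor tuned for only one of the two can be limiting in applications.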