
Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm

BACKGROUND: MicroRNAs (miRNAs) are short non-coding RNA molecules, which play an important role in post-transcriptional regulation of gene expression. There have been many efforts to discover miRNA precursors (pre-miRNAs) over the years. Recently, ab initio approaches have attracted more attention b...


Bibliographic Details
Main authors: Hsieh, Chih-Hung, Chang, Darby Tien-Hao, Hsueh, Cheng-Hao, Wu, Chi-Yeh, Oyang, Yen-Jen
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009525/
https://www.ncbi.nlm.nih.gov/pubmed/20122227
http://dx.doi.org/10.1186/1471-2105-11-S1-S52
author Hsieh, Chih-Hung
Chang, Darby Tien-Hao
Hsueh, Cheng-Hao
Wu, Chi-Yeh
Oyang, Yen-Jen
collection PubMed
description BACKGROUND: MicroRNAs (miRNAs) are short non-coding RNA molecules, which play an important role in post-transcriptional regulation of gene expression. There have been many efforts to discover miRNA precursors (pre-miRNAs) over the years. Recently, ab initio approaches have attracted more attention because they do not depend on homology information and provide broader applications than comparative approaches. Kernel based classifiers such as support vector machine (SVM) are extensively adopted in these ab initio approaches due to the prediction performance they achieved. On the other hand, logic based classifiers such as decision tree, of which the constructed model is interpretable, have attracted less attention. RESULTS: This article reports the design of a predictor of pre-miRNAs with a novel kernel based classifier named the generalized Gaussian density estimator (G(2)DE) based classifier. The G(2)DE is a kernel based algorithm designed to provide interpretability by utilizing a few but representative kernels for constructing the classification model. The performance of the proposed predictor has been evaluated with 692 human pre-miRNAs and has been compared with two kernel based and two logic based classifiers. The experimental results show that the proposed predictor is capable of achieving prediction performance comparable to those delivered by the prevailing kernel based classification algorithms, while providing the user with an overall picture of the distribution of the data set. CONCLUSION: Software predictors that identify pre-miRNAs in genomic sequences have been exploited by biologists to facilitate molecular biology research in recent years. The G(2)DE employed in this study can deliver prediction accuracy comparable with the state-of-the-art kernel based machine learning algorithms. Furthermore, biologists can obtain valuable insights about the different characteristics of the sequences of pre-miRNAs with the models generated by the G(2)DE based predictor.
format Text
id pubmed-3009525
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3009525 2010-12-23 Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm Hsieh, Chih-Hung Chang, Darby Tien-Hao Hsueh, Cheng-Hao Wu, Chi-Yeh Oyang, Yen-Jen BMC Bioinformatics Research BioMed Central 2010-01-18 /pmc/articles/PMC3009525/ /pubmed/20122227 http://dx.doi.org/10.1186/1471-2105-11-S1-S52 Text en Copyright ©2010 Hsieh et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009525/
https://www.ncbi.nlm.nih.gov/pubmed/20122227
http://dx.doi.org/10.1186/1471-2105-11-S1-S52