
Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants

MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and i...


Bibliographic Details
Main Authors: Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4844869/
https://www.ncbi.nlm.nih.gov/pubmed/27190509
http://dx.doi.org/10.1155/2016/5670851
_version_ 1782428831358910464
author Yousef, Malik
Saçar Demirci, Müşerref Duygu
Khalifa, Waleed
Allmer, Jens
author_sort Yousef, Malik
collection PubMed
description MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently, computational miRNA detection is mainly performed using machine learning, in particular two-class classification. For machine learning, the miRNAs need to be parametrized, and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems preferable to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to a 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6%, thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
format Online
Article
Text
id pubmed-4844869
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4844869 2016-05-17 Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants Yousef, Malik Saçar Demirci, Müşerref Duygu Khalifa, Waleed Allmer, Jens Adv Bioinformatics Research Article MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently, computational miRNA detection is mainly performed using machine learning, in particular two-class classification. For machine learning, the miRNAs need to be parametrized, and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems preferable to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to a 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6%, thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection. Hindawi Publishing Corporation 2016 2016-04-12 /pmc/articles/PMC4844869/ /pubmed/27190509 http://dx.doi.org/10.1155/2016/5670851 Text en Copyright © 2016 Malik Yousef et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4844869/
https://www.ncbi.nlm.nih.gov/pubmed/27190509
http://dx.doi.org/10.1155/2016/5670851