HuntMi: an efficient and taxon-specific approach in pre-miRNA identification
BACKGROUND: Machine learning techniques are known to be a powerful way of distinguishing microRNA hairpins from pseudo hairpins and have been applied in a number of recognised miRNA search tools. However, many current methods based on machine learning suffer from some drawbacks, including not addres...
Main authors: | Gudyś, Adam; Szcześniak, Michał Wojciech; Sikora, Marek; Makałowska, Izabela |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3686668/ https://www.ncbi.nlm.nih.gov/pubmed/23497112 http://dx.doi.org/10.1186/1471-2105-14-83 |
_version_ | 1782273811199033344 |
---|---|
author | Gudyś, Adam Szcześniak, Michał Wojciech Sikora, Marek Makałowska, Izabela |
author_facet | Gudyś, Adam Szcześniak, Michał Wojciech Sikora, Marek Makałowska, Izabela |
author_sort | Gudyś, Adam |
collection | PubMed |
description | BACKGROUND: Machine learning techniques are known to be a powerful way of distinguishing microRNA hairpins from pseudo hairpins and have been applied in a number of recognised miRNA search tools. However, many current methods based on machine learning suffer from some drawbacks, including not addressing the class imbalance problem properly. It may lead to overlearning the majority class and/or incorrect assessment of classification performance. Moreover, those tools are effective for a narrow range of species, usually the model ones. This study aims at improving performance of miRNA classification procedure, extending its usability and reducing computational time. RESULTS: We present HuntMi, a stand-alone machine learning miRNA classification tool. We developed a novel method of dealing with the class imbalance problem called ROC-select, which is based on thresholding score function produced by traditional classifiers. We also introduced new features to the data representation. Several classification algorithms in combination with ROC-select were tested and random forest was selected for the best balance between sensitivity and specificity. Reliable assessment of classification performance is guaranteed by using large, strongly imbalanced, and taxon-specific datasets in 10-fold cross-validation procedure. As a result, HuntMi achieves a considerably better performance than any other miRNA classification tool and can be applied in miRNA search experiments in a wide range of species. CONCLUSIONS: Our results indicate that HuntMi represents an effective and flexible tool for identification of new microRNAs in animals, plants and viruses. ROC-select strategy proves to be superior to other methods of dealing with class imbalance problem and can possibly be used in other machine learning classification tasks. The HuntMi software as well as datasets used in the research are freely available at http://lemur.amu.edu.pl/share/HuntMi/. |
format | Online Article Text |
id | pubmed-3686668 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3686668 2013-06-25 HuntMi: an efficient and taxon-specific approach in pre-miRNA identification Gudyś, Adam Szcześniak, Michał Wojciech Sikora, Marek Makałowska, Izabela BMC Bioinformatics Methodology Article BACKGROUND: Machine learning techniques are known to be a powerful way of distinguishing microRNA hairpins from pseudo hairpins and have been applied in a number of recognised miRNA search tools. However, many current methods based on machine learning suffer from some drawbacks, including not addressing the class imbalance problem properly. It may lead to overlearning the majority class and/or incorrect assessment of classification performance. Moreover, those tools are effective for a narrow range of species, usually the model ones. This study aims at improving performance of miRNA classification procedure, extending its usability and reducing computational time. RESULTS: We present HuntMi, a stand-alone machine learning miRNA classification tool. We developed a novel method of dealing with the class imbalance problem called ROC-select, which is based on thresholding score function produced by traditional classifiers. We also introduced new features to the data representation. Several classification algorithms in combination with ROC-select were tested and random forest was selected for the best balance between sensitivity and specificity. Reliable assessment of classification performance is guaranteed by using large, strongly imbalanced, and taxon-specific datasets in 10-fold cross-validation procedure. As a result, HuntMi achieves a considerably better performance than any other miRNA classification tool and can be applied in miRNA search experiments in a wide range of species. CONCLUSIONS: Our results indicate that HuntMi represents an effective and flexible tool for identification of new microRNAs in animals, plants and viruses. ROC-select strategy proves to be superior to other methods of dealing with class imbalance problem and can possibly be used in other machine learning classification tasks. The HuntMi software as well as datasets used in the research are freely available at http://lemur.amu.edu.pl/share/HuntMi/. BioMed Central 2013-03-05 /pmc/articles/PMC3686668/ /pubmed/23497112 http://dx.doi.org/10.1186/1471-2105-14-83 Text en Copyright © 2013 Gudyś et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Gudyś, Adam Szcześniak, Michał Wojciech Sikora, Marek Makałowska, Izabela HuntMi: an efficient and taxon-specific approach in pre-miRNA identification |
title | HuntMi: an efficient and taxon-specific approach in pre-miRNA identification |
title_full | HuntMi: an efficient and taxon-specific approach in pre-miRNA identification |
title_fullStr | HuntMi: an efficient and taxon-specific approach in pre-miRNA identification |
title_full_unstemmed | HuntMi: an efficient and taxon-specific approach in pre-miRNA identification |
title_short | HuntMi: an efficient and taxon-specific approach in pre-miRNA identification |
title_sort | huntmi: an efficient and taxon-specific approach in pre-mirna identification |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3686668/ https://www.ncbi.nlm.nih.gov/pubmed/23497112 http://dx.doi.org/10.1186/1471-2105-14-83 |
work_keys_str_mv | AT gudysadam huntmianefficientandtaxonspecificapproachinpremirnaidentification AT szczesniakmichałwojciech huntmianefficientandtaxonspecificapproachinpremirnaidentification AT sikoramarek huntmianefficientandtaxonspecificapproachinpremirnaidentification AT makałowskaizabela huntmianefficientandtaxonspecificapproachinpremirnaidentification |
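
Illustrative note on the description field: the abstract says only that ROC-select thresholds the score function produced by a traditional classifier to handle class imbalance. The Python sketch below shows one way such score thresholding could look. It is not the HuntMi implementation. The selection criterion (maximising the geometric mean of sensitivity and specificity), the function name `roc_select_threshold`, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def roc_select_threshold(scores, labels):
    """Pick a score threshold that balances sensitivity and specificity.

    ASSUMPTION: the criterion used here is the geometric mean
    sqrt(sensitivity * specificity); the catalog record does not state
    which criterion ROC-select actually uses.

    scores: classifier scores, higher means "more likely positive"
    labels: 1 for positive (e.g. true pre-miRNA), 0 for negative
    A sample is predicted positive when score >= threshold.
    Returns (threshold, sensitivity, specificity).
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    best = None
    # Every distinct score is a candidate operating point on the ROC curve.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens = tp / n_pos
        spec = tn / n_neg
        gmean = sqrt(sens * spec)
        if best is None or gmean > best[0]:
            best = (gmean, t, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical, strongly imbalanced toy data: 2 positives, 8 negatives.
scores = [0.9, 0.6, 0.7, 0.4, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

threshold, sens, spec = roc_select_threshold(scores, labels)
print(threshold, sens, spec)
```

Choosing the threshold from the scores, rather than resampling the training data, leaves the underlying classifier untouched, which is consistent with the abstract's description of ROC-select as working on top of traditional classifiers.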