Monte Carlo Tree Search-Based Recursive Algorithm for Feature Selection in High-Dimensional Datasets
Complexity and high dimensionality are inherent concerns of big data. Feature selection has gained prime importance in coping with this issue by reducing the dimensionality of datasets. The compromise between maximum classification accuracy and minimum dimensionality is as yet an un...
Main authors: | Chaudhry, Muhammad Umar; Yasir, Muhammad; Asghar, Muhammad Nabeel; Lee, Jee-Hyong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7597188/ https://www.ncbi.nlm.nih.gov/pubmed/33286862 http://dx.doi.org/10.3390/e22101093 |
_version_ | 1783602286392508416 |
---|---|
author | Chaudhry, Muhammad Umar Yasir, Muhammad Asghar, Muhammad Nabeel Lee, Jee-Hyong |
author_sort | Chaudhry, Muhammad Umar |
collection | PubMed |
description | Complexity and high dimensionality are inherent concerns of big data. Feature selection has gained prime importance in coping with this issue by reducing the dimensionality of datasets. The compromise between maximum classification accuracy and minimum dimensionality remains an unsolved puzzle. Recently, Monte Carlo Tree Search (MCTS)-based techniques have attained great success in feature selection by constructing a binary feature selection tree and efficiently focusing on the most valuable features in the feature space. However, one challenging problem associated with such approaches is the tradeoff between the tree search and the number of simulations. With a limited number of simulations, the tree might not reach sufficient depth, thus biasing feature subset selection towards randomness. In this paper, a new algorithm for feature selection is proposed in which multiple feature selection trees are built iteratively in a recursive fashion. The state space of every successor feature selection tree is smaller than that of its predecessor, thus increasing the impact of tree search in selecting the best features while keeping the number of MCTS simulations fixed. Experiments are performed on 16 benchmark datasets for validation. We also compare the performance with state-of-the-art methods in the literature in terms of both classification accuracy and the feature selection ratio. |
format | Online Article Text |
id | pubmed-7597188 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7597188 2020-11-09 Entropy (Basel) Article MDPI 2020-09-29 /pmc/articles/PMC7597188/ /pubmed/33286862 http://dx.doi.org/10.3390/e22101093 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | Monte Carlo Tree Search-Based Recursive Algorithm for Feature Selection in High-Dimensional Datasets |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7597188/ https://www.ncbi.nlm.nih.gov/pubmed/33286862 http://dx.doi.org/10.3390/e22101093 |
work_keys_str_mv | AT chaudhrymuhammadumar montecarlotreesearchbasedrecursivealgorithmforfeatureselectioninhighdimensionaldatasets AT yasirmuhammad montecarlotreesearchbasedrecursivealgorithmforfeatureselectioninhighdimensionaldatasets AT asgharmuhammadnabeel montecarlotreesearchbasedrecursivealgorithmforfeatureselectioninhighdimensionaldatasets AT leejeehyong montecarlotreesearchbasedrecursivealgorithmforfeatureselectioninhighdimensionaldatasets |
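
The abstract in this record describes building successive feature selection trees, each over a smaller state space than the last, while the MCTS simulation budget stays fixed. The following is a minimal Python sketch of that general idea, not the authors' implementation. The names `Node`, `ucb_child`, `mcts_select`, `recursive_select` and `toy_reward` are hypothetical, and the UCB1 selection rule, the uniform random rollout, the stopping rule and the toy reward (a stand-in for classifier accuracy) are all assumptions made for illustration.

```python
import math
import random


class Node:
    """One include/exclude decision in a binary feature selection tree."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}  # True (include) / False (exclude) -> Node
        self.visits = 0
        self.total = 0.0


def ucb_child(node, c=1.4):
    """Return (decision, child) with the highest UCB1 score (assumed rule)."""
    def score(item):
        child = item[1]
        mean = child.total / child.visits
        return mean + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.items(), key=score)


def mcts_select(features, reward, n_sims, rng):
    """Search one tree with a fixed budget; return the best subset seen."""
    root = Node()
    best_subset, best_reward = list(features), reward(list(features))
    for _ in range(n_sims):
        node, chosen, depth = root, [], 0
        # Selection: descend while both decisions have been tried.
        while depth < len(features) and len(node.children) == 2:
            include, node = ucb_child(node)
            if include:
                chosen.append(features[depth])
            depth += 1
        # Expansion: try one untried decision for the next feature.
        if depth < len(features):
            untried = [b for b in (True, False) if b not in node.children]
            include = rng.choice(untried)
            node.children[include] = Node(parent=node)
            node = node.children[include]
            if include:
                chosen.append(features[depth])
            depth += 1
        # Rollout: decide the remaining features uniformly at random.
        chosen += [f for f in features[depth:] if rng.random() < 0.5]
        r = reward(chosen) if chosen else 0.0
        if chosen and r > best_reward:
            best_subset, best_reward = chosen, r
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += r
            node = node.parent
    return best_subset, best_reward


def recursive_select(features, reward, n_sims=200, seed=0):
    """Rebuild the tree over each round's best subset until it stops shrinking."""
    rng = random.Random(seed)
    current = list(features)
    while True:
        subset, _ = mcts_select(current, reward, n_sims, rng)
        if len(subset) >= len(current):  # no smaller, better subset found
            return current
        current = subset  # the next tree has a smaller state space


# Toy stand-in for classifier accuracy: features 0, 3 and 7 are "relevant",
# and every selected feature costs a small penalty.
RELEVANT = {0, 3, 7}


def toy_reward(subset):
    hits = len(RELEVANT.intersection(subset))
    return hits / len(RELEVANT) - 0.02 * len(subset)


selected = recursive_select(range(10), toy_reward, n_sims=200, seed=0)
print(selected)
```

Each round keeps `n_sims` constant, so the same budget is spent on a tree with fewer features to decide, which is the effect the abstract attributes to the recursive scheme. In practice `toy_reward` would be replaced by a cross-validated classifier score.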