A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities
Specialized data preparation techniques, ranging from data cleaning, outlier detection, missing value imputation, feature selection (FS), amongst others, are procedures required to get the most out of data and, consequently, get the optimal performance of predictive models for classification tasks....
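Main authors: | Abiodun, Esther Omolara; Alabdulatif, Abdulatif; Abiodun, Oludare Isaac; Alawida, Moatsum; Alabdulatif, Abdullah; Alkhawaldeh, Rami S. |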
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer London, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8361413/ https://www.ncbi.nlm.nih.gov/pubmed/34404964 http://dx.doi.org/10.1007/s00521-021-06406-8 |
_version_ | 1783737950011392000 |
---|---|
author | Abiodun, Esther Omolara Alabdulatif, Abdulatif Abiodun, Oludare Isaac Alawida, Moatsum Alabdulatif, Abdullah Alkhawaldeh, Rami S. |
author_sort | Abiodun, Esther Omolara |
collection | PubMed |
description | Specialized data preparation techniques, ranging from data cleaning, outlier detection, missing value imputation, feature selection (FS), amongst others, are procedures required to get the most out of data and, consequently, get the optimal performance of predictive models for classification tasks. FS is a vital and indispensable technique that enables the model to perform faster, eliminate noisy data, remove redundancy, reduce overfitting, improve precision and increase generalization on testing data. While conventional FS techniques have been leveraged for classification tasks in the past few decades, they fail to optimally reduce the high dimensionality of the feature space of texts, thus breeding inefficient predictive models. Emerging technologies such as the metaheuristics and hyper-heuristics optimization methods provide a new paradigm for FS due to their efficiency in improving the accuracy of classification, computational demands, storage, as well as functioning seamlessly in solving complex optimization problems with less time. However, little details are known on best practices for case-to-case usage of emerging FS methods. The literature continues to be engulfed with clear and unclear findings in leveraging effective methods, which, if not performed accurately, alters precision, real-world-use feasibility, and the predictive model's overall performance. This paper reviews the present state of FS with respect to metaheuristics and hyper-heuristic methods. Through a systematic literature review of over 200 articles, we set out the most recent findings and trends to enlighten analysts, practitioners and researchers in the field of data analytics seeking clarity in understanding and implementing effective FS optimization methods for improved text classification tasks. |
format | Online Article Text |
id | pubmed-8361413 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer London |
record_format | MEDLINE/PubMed |
spelling | pubmed-83614132021-08-13 A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities Abiodun, Esther Omolara Alabdulatif, Abdulatif Abiodun, Oludare Isaac Alawida, Moatsum Alabdulatif, Abdullah Alkhawaldeh, Rami S. Neural Comput Appl Review Article Specialized data preparation techniques, ranging from data cleaning, outlier detection, missing value imputation, feature selection (FS), amongst others, are procedures required to get the most out of data and, consequently, get the optimal performance of predictive models for classification tasks. FS is a vital and indispensable technique that enables the model to perform faster, eliminate noisy data, remove redundancy, reduce overfitting, improve precision and increase generalization on testing data. While conventional FS techniques have been leveraged for classification tasks in the past few decades, they fail to optimally reduce the high dimensionality of the feature space of texts, thus breeding inefficient predictive models. Emerging technologies such as the metaheuristics and hyper-heuristics optimization methods provide a new paradigm for FS due to their efficiency in improving the accuracy of classification, computational demands, storage, as well as functioning seamlessly in solving complex optimization problems with less time. However, little details are known on best practices for case-to-case usage of emerging FS methods. The literature continues to be engulfed with clear and unclear findings in leveraging effective methods, which, if not performed accurately, alters precision, real-world-use feasibility, and the predictive model's overall performance. This paper reviews the present state of FS with respect to metaheuristics and hyper-heuristic methods. 
Through a systematic literature review of over 200 articles, we set out the most recent findings and trends to enlighten analysts, practitioners and researchers in the field of data analytics seeking clarity in understanding and implementing effective FS optimization methods for improved text classification tasks. Springer London 2021-08-13 2021 /pmc/articles/PMC8361413/ /pubmed/34404964 http://dx.doi.org/10.1007/s00521-021-06406-8 Text en © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Review Article Abiodun, Esther Omolara Alabdulatif, Abdulatif Abiodun, Oludare Isaac Alawida, Moatsum Alabdulatif, Abdullah Alkhawaldeh, Rami S. A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities |
title | A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities |
title_sort | systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities |
topic | Review Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8361413/ https://www.ncbi.nlm.nih.gov/pubmed/34404964 http://dx.doi.org/10.1007/s00521-021-06406-8 |