Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets

Classification accuracy achieved by a machine learning technique depends on the feature set used in the learning process. However, not all the features extracted for a particular task contribute to the classification process. Feature selection (FS) is an imper...

Bibliographic Details
Main Authors: Ahmed, Shameem, Sheikh, Khalid Hassan, Mirjalili, Seyedali, Sarkar, Ram
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9396289/
https://www.ncbi.nlm.nih.gov/pubmed/36034050
http://dx.doi.org/10.1016/j.eswa.2022.116834
author Ahmed, Shameem
Sheikh, Khalid Hassan
Mirjalili, Seyedali
Sarkar, Ram
collection PubMed
description Classification accuracy achieved by a machine learning technique depends on the feature set used in the learning process. However, not all the features extracted for a particular task contribute to the classification process. Feature selection (FS) is an imperative and challenging pre-processing technique that discards unnecessary and irrelevant features, reducing computational time and space requirements and increasing classification accuracy. Generalized Normal Distribution Optimizer (GNDO), a recently proposed meta-heuristic algorithm, can be used to solve any optimization problem. In this paper, a hybrid of GNDO and Simulated Annealing (SA), called Binary Simulated Normal Distribution Optimizer (BSNDO), is proposed; it uses SA as a local search to achieve higher classification accuracy. The proposed method is evaluated on 18 well-known UCI datasets and compared with its predecessor as well as some popular FS methods. Moreover, the method is tested on high-dimensional microarray datasets to prove its worth on real-life datasets. It is also applied to a COVID-19 dataset for classification purposes. The obtained results prove the usefulness of BSNDO as an FS method. The source code of this work is publicly available at https://github.com/ahmed-shameem/Feature_selection.
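The description above mentions SA used as a local search over feature subsets. As a rough illustration of that general idea only, the Python sketch below refines a binary feature mask with simulated annealing. It is not the authors' BSNDO or GNDO code (their implementation is in the repository linked in the description). The names `toy_fitness` and `sa_local_search`, the single-bit-flip neighbourhood, the geometric cooling schedule, and the weight `alpha` are all assumptions made for this example. The fitness is a mock stand-in for a classifier-based wrapper.

```python
import math
import random


def toy_fitness(mask, relevant, alpha=0.99):
    """Hypothetical stand-in for a wrapper fitness (lower is better).

    Mixes a mock error term (fraction of "relevant" features missed) with
    the fraction of features kept. A real wrapper would train and score a
    classifier on the selected features instead.
    """
    missed = sum(1 for i in relevant if not mask[i]) / len(relevant)
    kept = sum(mask) / len(mask)
    return alpha * missed + (1 - alpha) * kept


def sa_local_search(mask, fitness, t0=1.0, cooling=0.95, steps=200, rng=None):
    """Refine a binary feature mask with simulated annealing (assumed setup)."""
    rng = rng or random.Random()
    current = list(mask)
    current_fit = fitness(current)
    best, best_fit = list(current), current_fit
    temp = t0
    for _ in range(steps):
        # Neighbour: flip one randomly chosen bit (select or deselect a feature).
        candidate = list(current)
        j = rng.randrange(len(candidate))
        candidate[j] = 1 - candidate[j]
        cand_fit = fitness(candidate)
        delta = cand_fit - current_fit
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_fit = candidate, cand_fit
            if current_fit < best_fit:
                best, best_fit = list(current), current_fit
        temp *= cooling  # geometric cooling schedule
    return best, best_fit


if __name__ == "__main__":
    rng = random.Random(0)
    n_features = 20
    relevant = [1, 4, 7]  # assumed "useful" features for the toy fitness
    start = [rng.randint(0, 1) for _ in range(n_features)]
    best, best_fit = sa_local_search(
        start, lambda m: toy_fitness(m, relevant), rng=rng
    )
    print(best, round(best_fit, 4))
```

In a hybrid scheme of the kind the description outlines, a routine like this would be called on candidate solutions produced by the global optimizer. How and when BSNDO does so is not stated in this record.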
format Online
Article
Text
id pubmed-9396289
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-9396289 2022-08-23 Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets Ahmed, Shameem Sheikh, Khalid Hassan Mirjalili, Seyedali Sarkar, Ram Expert Syst Appl Article Elsevier Ltd. 2022-08-15 2022-03-15 /pmc/articles/PMC9396289/ /pubmed/36034050 http://dx.doi.org/10.1016/j.eswa.2022.116834 Text en © 2022 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets
topic Article