Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets
Classification accuracy achieved by a machine learning technique depends on the feature set used in the learning process. However, it is often found that all the features extracted by some means for a particular task do not contribute to the classification process. Feature selection (FS) is an imper...
Main authors: | Ahmed, Shameem; Sheikh, Khalid Hassan; Mirjalili, Seyedali; Sarkar, Ram |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Ltd. 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9396289/ https://www.ncbi.nlm.nih.gov/pubmed/36034050 http://dx.doi.org/10.1016/j.eswa.2022.116834 |
_version_ | 1784771897522651136 |
---|---|
author | Ahmed, Shameem Sheikh, Khalid Hassan Mirjalili, Seyedali Sarkar, Ram |
author_facet | Ahmed, Shameem Sheikh, Khalid Hassan Mirjalili, Seyedali Sarkar, Ram |
author_sort | Ahmed, Shameem |
collection | PubMed |
description | Classification accuracy achieved by a machine learning technique depends on the feature set used in the learning process. However, it is often found that all the features extracted by some means for a particular task do not contribute to the classification process. Feature selection (FS) is an imperative and challenging pre-processing technique that helps to discard the unnecessary and irrelevant features while reducing the computational time and space requirement and increasing the classification accuracy. Generalized Normal Distribution Optimizer (GNDO), a recently proposed meta-heuristic algorithm, can be used to solve any optimization problem. In this paper, a hybrid version of GNDO with Simulated Annealing (SA) called Binary Simulated Normal Distribution Optimizer (BSNDO) is proposed which uses SA as a local search to achieve higher classification accuracy. The proposed method is evaluated on 18 well-known UCI datasets and compared with its predecessor as well as some popular FS methods. Moreover, this method is tested on high dimensional microarray datasets to prove its worth in real-life datasets. On top of that, it is also applied to a COVID-19 dataset for classification purposes. The obtained results prove the usefulness of BSNDO as a FS method. The source code of this work is publicly available at https://github.com/ahmed-shameem/Feature_selection. |
format | Online Article Text |
id | pubmed-9396289 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-93962892022-08-23 Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets Ahmed, Shameem Sheikh, Khalid Hassan Mirjalili, Seyedali Sarkar, Ram Expert Syst Appl Article Classification accuracy achieved by a machine learning technique depends on the feature set used in the learning process. However, it is often found that all the features extracted by some means for a particular task do not contribute to the classification process. Feature selection (FS) is an imperative and challenging pre-processing technique that helps to discard the unnecessary and irrelevant features while reducing the computational time and space requirement and increasing the classification accuracy. Generalized Normal Distribution Optimizer (GNDO), a recently proposed meta-heuristic algorithm, can be used to solve any optimization problem. In this paper, a hybrid version of GNDO with Simulated Annealing (SA) called Binary Simulated Normal Distribution Optimizer (BSNDO) is proposed which uses SA as a local search to achieve higher classification accuracy. The proposed method is evaluated on 18 well-known UCI datasets and compared with its predecessor as well as some popular FS methods. Moreover, this method is tested on high dimensional microarray datasets to prove its worth in real-life datasets. On top of that, it is also applied to a COVID-19 dataset for classification purposes. The obtained results prove the usefulness of BSNDO as a FS method. The source code of this work is publicly available at https://github.com/ahmed-shameem/Feature_selection. Elsevier Ltd. 2022-08-15 2022-03-15 /pmc/articles/PMC9396289/ /pubmed/36034050 http://dx.doi.org/10.1016/j.eswa.2022.116834 Text en © 2022 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. 
The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Ahmed, Shameem Sheikh, Khalid Hassan Mirjalili, Seyedali Sarkar, Ram Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets |
title | Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets |
title_full | Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets |
title_fullStr | Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets |
title_full_unstemmed | Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets |
title_short | Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets |
title_sort | binary simulated normal distribution optimizer for feature selection: theory and application in covid-19 datasets |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9396289/ https://www.ncbi.nlm.nih.gov/pubmed/36034050 http://dx.doi.org/10.1016/j.eswa.2022.116834 |
work_keys_str_mv | AT ahmedshameem binarysimulatednormaldistributionoptimizerforfeatureselectiontheoryandapplicationincovid19datasets AT sheikhkhalidhassan binarysimulatednormaldistributionoptimizerforfeatureselectiontheoryandapplicationincovid19datasets AT mirjaliliseyedali binarysimulatednormaldistributionoptimizerforfeatureselectiontheoryandapplicationincovid19datasets AT sarkarram binarysimulatednormaldistributionoptimizerforfeatureselectiontheoryandapplicationincovid19datasets |
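
The description above says that BSNDO uses Simulated Annealing (SA) as a local search over candidate feature subsets. The following is a minimal sketch of that general idea only, not the authors' implementation (their code is at the GitHub URL given in the description). It assumes a feature subset is encoded as a binary mask and that lower fitness is better. The names `toy_fitness` and `sa_local_search`, and the parameters `alpha`, `t0`, `cooling`, and `steps`, are hypothetical choices made for illustration. The GNDO population update is omitted entirely, and the fitness is a synthetic stand-in for a classifier-based evaluation.

```python
import math
import random


def toy_fitness(mask, relevant, alpha=0.99):
    """Hypothetical stand-in for a wrapper fitness (lower is better).

    'error' is the fraction of known-relevant features left out of the mask;
    a real wrapper would use a classifier's error rate instead.
    """
    missed = sum(1 for i in relevant if not mask[i])
    error = missed / len(relevant)
    ratio = sum(mask) / len(mask)  # fraction of features selected
    return alpha * error + (1 - alpha) * ratio


def sa_local_search(mask, fitness, t0=1.0, cooling=0.95, steps=200, rng=None):
    """Refine a binary feature mask with simulated annealing (assumed settings)."""
    rng = rng or random.Random()
    current = list(mask)
    current_fit = fitness(current)
    best, best_fit = list(current), current_fit
    temp = t0
    for _ in range(steps):
        # Neighbour: flip one randomly chosen feature bit.
        neighbour = list(current)
        j = rng.randrange(len(neighbour))
        neighbour[j] = 1 - neighbour[j]
        neighbour_fit = fitness(neighbour)
        delta = neighbour_fit - current_fit
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_fit = neighbour, neighbour_fit
            if current_fit < best_fit:
                best, best_fit = list(current), current_fit
        temp *= cooling  # geometric cooling schedule
    return best, best_fit


rng = random.Random(0)
relevant = {0, 3, 7}  # hypothetical indices of informative features
start = [rng.randint(0, 1) for _ in range(10)]
best, best_fit = sa_local_search(start, lambda m: toy_fitness(m, relevant), rng=rng)
print(best, round(best_fit, 4))
```

In a hybrid of the kind the description outlines, a routine like this would presumably be applied to solutions produced by the global optimizer, so that the global search explores and the SA step refines. How and when BSNDO invokes it is not stated in this record.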