
Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection

This paper proposes new improved binary versions of the Sine Cosine Algorithm (SCA) for the Feature Selection (FS) problem. FS is an essential machine learning and data mining task of choosing a subset of highly discriminating features from noisy, irrelevant, high-dimensional, and redundant features...


Bibliographic Details
Main authors: Abed-alguni, Bilal H., Alawad, Noor Aldeen, Al-Betar, Mohammed Azmi, Paul, David
Format: Online Article Text
Language: English
Published: Springer US 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9547101/
https://www.ncbi.nlm.nih.gov/pubmed/36247211
http://dx.doi.org/10.1007/s10489-022-04201-z
_version_ 1784805190912704512
author Abed-alguni, Bilal H.
Alawad, Noor Aldeen
Al-Betar, Mohammed Azmi
Paul, David
author_facet Abed-alguni, Bilal H.
Alawad, Noor Aldeen
Al-Betar, Mohammed Azmi
Paul, David
author_sort Abed-alguni, Bilal H.
collection PubMed
description This paper proposes new improved binary versions of the Sine Cosine Algorithm (SCA) for the Feature Selection (FS) problem. FS is an essential machine learning and data mining task of choosing a subset of highly discriminating features from noisy, irrelevant, high-dimensional, and redundant features to best represent a dataset. SCA is a recent metaheuristic algorithm established to emulate a model based on sine and cosine trigonometric functions. It was initially proposed to tackle problems in the continuous domain. The SCA has been modified to Binary SCA (BSCA) to deal with the binary domain of the FS problem. To improve the performance of BSCA, three accumulative improved variations are proposed (i.e., IBSCA1, IBSCA2, and IBSCA3), where the last version has the best performance. IBSCA1 employs Opposition Based Learning (OBL) to help ensure a diverse population of candidate solutions. IBSCA2 improves IBSCA1 by adding Variable Neighborhood Search (VNS) and Laplace distribution to support several mutation methods. IBSCA3 improves IBSCA2 by optimizing the best candidate solution using Refraction Learning (RL), a novel OBL approach based on light refraction. For performance evaluation, 19 real-world datasets, including a COVID-19 dataset, were selected with different numbers of features, classes, and instances. Three performance measurements have been used to test the IBSCA versions: classification accuracy, number of features, and fitness values. Furthermore, the performance of the last variation, IBSCA3, is compared against 28 existing popular algorithms. Interestingly, IBSCA3 outperformed almost all comparative methods in terms of classification accuracy and fitness values. At the same time, it was ranked 15 out of 19 in terms of number of features. The overall simulation and statistical results indicate that IBSCA3 performs better than the other algorithms.
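The description names the building blocks but this record does not give their update rules. As a rough illustration only, the following Python sketch shows the commonly cited continuous SCA position update, a bit-flip form of opposition for binary vectors, and a sigmoid transfer function for binarization. The function names, the constant a = 2, the clamping range, and the choice of a sigmoid transfer function are assumptions made for this sketch; they are not taken from the paper and may differ from the operators used in BSCA or the IBSCA variants.

```python
import math
import random


def opposite(bits):
    """Binary opposition: flip each bit (binary analogue of lb + ub - x).

    Assumption: the paper's OBL operator may be defined differently.
    """
    return [1 - b for b in bits]


def sca_step(x, best, t, max_iter, a=2.0, rng=random):
    """One continuous SCA update of position x relative to the best solution.

    r1 shrinks linearly from a to 0, moving the search from exploration
    to exploitation; r2, r3, r4 are drawn fresh for every dimension.
    """
    r1 = a - t * (a / max_iter)
    new_x = []
    for xi, pi in zip(x, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        r4 = rng.random()
        wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        new_x.append(xi + r1 * wave * abs(r3 * pi - xi))
    return new_x


def binarize(x, rng=random):
    """Map continuous positions to bits with a sigmoid transfer function.

    Assumption: an S-shaped transfer function; values are clamped to
    [-10, 10] so math.exp cannot overflow.
    """
    bits = []
    for xi in x:
        clamped = max(-10.0, min(10.0, xi))
        prob = 1.0 / (1.0 + math.exp(-clamped))
        bits.append(1 if rng.random() < prob else 0)
    return bits


if __name__ == "__main__":
    rng = random.Random(0)
    current = [1, 0, 1, 1, 0]   # 1 = feature selected, 0 = feature dropped
    best = [1, 1, 0, 1, 0]      # hypothetical best-so-far feature mask
    moved = sca_step(current, best, t=1, max_iter=10, rng=rng)
    print(binarize(moved, rng=rng))
    print(opposite(current))
```

In a feature selection setting each bit marks whether a feature is kept, so the opposite of a candidate selects exactly the features the candidate dropped, which is one simple way to diversify an initial population.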
format Online
Article
Text
id pubmed-9547101
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-9547101 2022-10-11 Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection Abed-alguni, Bilal H. Alawad, Noor Aldeen Al-Betar, Mohammed Azmi Paul, David Appl Intell (Dordr) Article This paper proposes new improved binary versions of the Sine Cosine Algorithm (SCA) for the Feature Selection (FS) problem. FS is an essential machine learning and data mining task of choosing a subset of highly discriminating features from noisy, irrelevant, high-dimensional, and redundant features to best represent a dataset. SCA is a recent metaheuristic algorithm established to emulate a model based on sine and cosine trigonometric functions. It was initially proposed to tackle problems in the continuous domain. The SCA has been modified to Binary SCA (BSCA) to deal with the binary domain of the FS problem. To improve the performance of BSCA, three accumulative improved variations are proposed (i.e., IBSCA1, IBSCA2, and IBSCA3), where the last version has the best performance. IBSCA1 employs Opposition Based Learning (OBL) to help ensure a diverse population of candidate solutions. IBSCA2 improves IBSCA1 by adding Variable Neighborhood Search (VNS) and Laplace distribution to support several mutation methods. IBSCA3 improves IBSCA2 by optimizing the best candidate solution using Refraction Learning (RL), a novel OBL approach based on light refraction. For performance evaluation, 19 real-world datasets, including a COVID-19 dataset, were selected with different numbers of features, classes, and instances. Three performance measurements have been used to test the IBSCA versions: classification accuracy, number of features, and fitness values. Furthermore, the performance of the last variation, IBSCA3, is compared against 28 existing popular algorithms. Interestingly, IBSCA3 outperformed almost all comparative methods in terms of classification accuracy and fitness values.
At the same time, it was ranked 15 out of 19 in terms of number of features. The overall simulation and statistical results indicate that IBSCA3 performs better than the other algorithms. Springer US 2022-10-08 2023 /pmc/articles/PMC9547101/ /pubmed/36247211 http://dx.doi.org/10.1007/s10489-022-04201-z Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022. Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Abed-alguni, Bilal H.
Alawad, Noor Aldeen
Al-Betar, Mohammed Azmi
Paul, David
Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection
title Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection
title_full Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection
title_fullStr Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection
title_full_unstemmed Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection
title_short Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection
title_sort opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9547101/
https://www.ncbi.nlm.nih.gov/pubmed/36247211
http://dx.doi.org/10.1007/s10489-022-04201-z
work_keys_str_mv AT abedalgunibilalh oppositionbasedsinecosineoptimizerutilizingrefractionlearningandvariableneighborhoodsearchforfeatureselection
AT alawadnooraldeen oppositionbasedsinecosineoptimizerutilizingrefractionlearningandvariableneighborhoodsearchforfeatureselection
AT albetarmohammedazmi oppositionbasedsinecosineoptimizerutilizingrefractionlearningandvariableneighborhoodsearchforfeatureselection
AT pauldavid oppositionbasedsinecosineoptimizerutilizingrefractionlearningandvariableneighborhoodsearchforfeatureselection