
Evolutionary binary feature selection using adaptive ebola optimization search algorithm for high-dimensional datasets

The feature selection problem is a field of study that requires approximate algorithms to identify discriminative and optimally combined features. The suitability of the selected features is often evaluated using classifiers. These features are locked within data increasingly be...


Bibliographic Details
Main authors: Oyelade, Olaide N., Agushaka, Jeffrey O., Ezugwu, Absalom E.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10022820/
https://www.ncbi.nlm.nih.gov/pubmed/36930670
http://dx.doi.org/10.1371/journal.pone.0282812
author Oyelade, Olaide N.
Agushaka, Jeffrey O.
Ezugwu, Absalom E.
collection PubMed
description The feature selection problem is a field of study that requires approximate algorithms to identify discriminative and optimally combined features. The suitability of the selected features is often evaluated using classifiers. These features are locked within data increasingly being generated from different sources such as social media, surveillance systems, network applications, and medical records. The high dimensionality of these datasets often impairs the quality of the selected feature combination. Binary optimization methods have been proposed in the literature to address this challenge. However, the underlying deficiency of a single binary optimizer is transferred to the quality of the features selected. Though hybrid methods have been proposed, most still suffer from the inherited design limitations of the single methods they combine. To address this, we propose a novel hybrid binary optimization method capable of effectively selecting features from increasingly high-dimensional datasets. The approach designs a sub-population selective mechanism that dynamically assigns individuals to a 2-level optimization process. The level-1 method first mutates items in the population and then reassigns them to a level-2 optimizer. The selective mechanism determines which sub-population is assigned to the level-2 optimizer based on the exploration and exploitation phases of the level-1 optimizer. In addition, we designed nested transfer (NT) functions and investigated their influence on the level-1 optimizer. The binary Ebola optimization search algorithm (BEOSA) is applied for the level-1 mutation, while the simulated annealing (SA) and firefly (FFA) algorithms are investigated for the level-2 optimizer. The outcomes are HBEOSA-SA and HBEOSA-FFA, which are then investigated on the NT, and their corresponding variants HBEOSA-SA-NT and HBEOSA-FFA-NT with no NT applied. The hybrid methods were experimentally tested on high-dimensional datasets to address the challenge of feature selection. A comparative analysis of the methods was carried out to assess how their performance varies on low-dimensional datasets. Classification accuracy for large, medium, and small-scale datasets is 0.995 using HBEOSA-FFA, 0.967 using HBEOSA-FFA-NT, and 0.953 using HBEOSA-FFA, respectively. Fitness and cost values for large, medium, and small-scale datasets are 0.066 and 0.934 using HBEOSA-FFA, 0.068 and 0.932 using HBEOSA-FFA, and 0.222 and 0.970 using HBEOSA-SA-NT, respectively. Findings from the study indicate that HBEOSA-SA, HBEOSA-FFA, HBEOSA-SA-NT, and HBEOSA-FFA-NT outperformed BEOSA.
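The two-level idea in the abstract (binarize candidate solutions with a transfer function, then hand a selected individual to a second optimizer) can be illustrated with a minimal Python sketch. It is not the authors' BEOSA or their nested transfer functions, neither of which is specified in this record. It assumes a plain sigmoid transfer function, a weighted fitness of classifier error plus the fraction of features kept (a common wrapper-style choice), a hypothetical `toy_error_rate` standing in for a real classifier, and a simplified selection rule that passes only the best initial mask to a simulated annealing step.

```python
import math
import random


def s_shaped_transfer(x: float) -> float:
    """Standard sigmoid transfer function mapping a real value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Turn a continuous position into a 0/1 feature mask via the transfer function."""
    return [1 if rng.random() < s_shaped_transfer(x) else 0 for x in position]


def fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of classifier error and fraction of features kept (lower is better)."""
    if not any(mask):
        return 1.0  # assumption: an empty feature subset is treated as the worst case
    return alpha * error_rate(mask) + (1.0 - alpha) * sum(mask) / len(mask)


def toy_error_rate(mask, relevant=(0, 2, 5)):
    """Hypothetical stand-in for a classifier: error falls as relevant features are kept."""
    hits = sum(mask[i] for i in relevant)
    return 1.0 - hits / len(relevant)


def anneal(mask, error_rate, rng, steps=200, t0=1.0, cooling=0.97):
    """Level-2 refinement: simulated annealing over single-bit flips of the mask."""
    current, current_fit = list(mask), fitness(mask, error_rate)
    best, best_fit = list(current), current_fit
    temperature = t0
    for _ in range(steps):
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] = 1 - candidate[i]  # flip one feature in or out
        cand_fit = fitness(candidate, error_rate)
        delta = cand_fit - current_fit
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_fit = candidate, cand_fit
            if current_fit < best_fit:
                best, best_fit = list(current), current_fit
        temperature *= cooling
    return best, best_fit


rng = random.Random(0)
# Level 1 (simplified): random continuous positions, binarized into feature masks.
population = [[rng.uniform(-4, 4) for _ in range(8)] for _ in range(6)]
masks = [binarize(p, rng) for p in population]
# Selection (simplified): pass only the best level-1 mask on to level 2.
masks.sort(key=lambda m: fitness(m, toy_error_rate))
start = masks[0]
start_fit = fitness(start, toy_error_rate)
refined, refined_fit = anneal(start, toy_error_rate, rng)
print(start, round(start_fit, 4))
print(refined, round(refined_fit, 4))
```

Because the annealing step tracks the best mask seen so far, the refined fitness can never be worse than that of the mask it started from. In a real experiment the toy error function would be replaced by the validation error of a classifier trained on the selected columns.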
format Online
Article
Text
id pubmed-10022820
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10022820 2023-03-18 Evolutionary binary feature selection using adaptive ebola optimization search algorithm for high-dimensional datasets Oyelade, Olaide N. Agushaka, Jeffrey O. Ezugwu, Absalom E. PLoS One Research Article Public Library of Science 2023-03-17 /pmc/articles/PMC10022820/ /pubmed/36930670 http://dx.doi.org/10.1371/journal.pone.0282812 Text en © 2023 Oyelade et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Evolutionary binary feature selection using adaptive ebola optimization search algorithm for high-dimensional datasets
topic Research Article