
Feature Selection via Chaotic Antlion Optimization

BACKGROUND: Selecting a subset of relevant properties from a large set of features that describe a dataset is a challenging machine learning task. In biology, for instance, the advances in the available technologies enable the generation of a very large number of biomarkers that describe the data. C...


Bibliographic Details
Main authors: Zawbaa, Hossam M., Emary, E., Grosan, Crina
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4786139/
https://www.ncbi.nlm.nih.gov/pubmed/26963715
http://dx.doi.org/10.1371/journal.pone.0150652
_version_ 1782420504287641600
author Zawbaa, Hossam M.
Emary, E.
Grosan, Crina
collection PubMed
description BACKGROUND: Selecting a subset of relevant properties from a large set of features that describe a dataset is a challenging machine learning task. In biology, for instance, advances in the available technologies enable the generation of a very large number of biomarkers that describe the data. Choosing the most informative markers while also classifying the data with high accuracy can be a daunting task, particularly if the data are high dimensional. An often-adopted approach is to formulate feature selection as a bi-objective optimization problem that aims to maximize the performance of the data analysis model (the quality of its fit to the training data) while minimizing the number of features used. RESULTS: We propose an optimization approach for the feature selection problem that considers a “chaotic” version of the antlion optimizer method, a nature-inspired algorithm that mimics the hunting mechanism of antlions. Balancing exploration of the search space against exploitation of the best solutions is a challenge in multi-objective optimization. The exploration/exploitation rate is controlled by the parameter I, which limits the random walk range of the ants/prey. This variable is increased iteratively in a quasi-linear manner to decrease the exploration rate as the optimization progresses. The resulting quasi-linear decrease in exploration may lead to premature convergence in some cases and trapping in local minima in others. The chaotic system proposed here attempts to improve the tradeoff between exploration and exploitation. The methodology is evaluated using different chaotic maps on a number of feature selection datasets. To ensure generality, we used ten biological datasets, but we also used other types of data from various sources. The results are compared with the particle swarm optimizer and with genetic algorithm variants for feature selection using a set of quality metrics.
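The ideas in the description above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the logistic map is only one common example of a chaotic map, the weighted-sum form of the bi-objective fitness and the weight alpha are assumptions, and the helper names (logistic_map, walk_bounds, fitness) are hypothetical.

```python
def logistic_map(x0, n, r=4.0):
    """Return n successive values of the logistic map x <- r * x * (1 - x).

    With r = 4 and a start value in (0, 1), the sequence stays in [0, 1]
    and behaves chaotically. Assumption: used here only as an example map.
    """
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def walk_bounds(lower, upper, ratio):
    """Shrink the random-walk interval by dividing both bounds by `ratio`.

    A larger ratio (the parameter I in the description) gives a narrower
    interval, i.e. less exploration.
    """
    return lower / ratio, upper / ratio


def fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Hypothetical weighted sum of the two objectives (lower is better):
    classification error and the fraction of features kept."""
    return alpha * error_rate + (1.0 - alpha) * n_selected / n_total


if __name__ == "__main__":
    chaos = logistic_map(0.7, 5)
    print(chaos)
    print(walk_bounds(-1.0, 1.0, 2.0))
    print(fitness(0.1, 5, 10))
```

In such a sketch, a chaotic sequence could replace the quasi-linear schedule that drives the ratio, so that the walk range does not shrink monotonically; how the paper actually couples the map to I is not stated in this record.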
format Online
Article
Text
id pubmed-4786139
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4786139 2016-03-23 Feature Selection via Chaotic Antlion Optimization Zawbaa, Hossam M. Emary, E. Grosan, Crina PLoS One Research Article Public Library of Science 2016-03-10 /pmc/articles/PMC4786139/ /pubmed/26963715 http://dx.doi.org/10.1371/journal.pone.0150652 Text en © 2016 Zawbaa et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Feature Selection via Chaotic Antlion Optimization
topic Research Article