
DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm

Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires id...

Bibliographic Details
Main authors: Soufan, Othman, Kleftogiannis, Dimitrios, Kalnis, Panos, Bajic, Vladimir B.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4342225/
https://www.ncbi.nlm.nih.gov/pubmed/25719748
http://dx.doi.org/10.1371/journal.pone.0117988
author Soufan, Othman
Kleftogiannis, Dimitrios
Kalnis, Panos
Bajic, Vladimir B.
collection PubMed
description Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for class prediction. The efficient implementation of classification models requires the identification of suitable combinations of features. A smaller number of features reduces the problem’s dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction in the number of features without sacrificing performance, compared with several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
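The wrapper paradigm named in the description can be illustrated with a minimal sketch. This is not the DWFS implementation: the nearest-centroid classifier, the leave-one-out scoring, the form of the fitness function, and every name and default below (`nearest_centroid_accuracy`, `fitness`, `select_features`, `weight`, `mutation_rate`) are assumptions chosen for illustration. Each candidate is a bit mask over the features, and its fitness trades classification accuracy against the fraction of features kept. The sketch evaluates candidates serially; because each evaluation is independent, a parallel version could score them concurrently.

```python
import random


def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier that
    only sees the features whose mask bit is 1 (illustrative classifier)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue  # hold out sample i
            s = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                s[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1

        def sq_dist(label):
            # squared distance from sample i to the centroid of `label`
            return sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        correct += min(sums, key=sq_dist) == y[i]
    return correct / len(X)


def fitness(X, y, mask, weight=0.9):
    """Assumed fitness: weighted sum of accuracy and feature reduction."""
    reduction = 1.0 - sum(mask) / len(mask)
    return weight * nearest_centroid_accuracy(X, y, mask) + (1.0 - weight) * reduction


def select_features(X, y, pop_size=20, generations=30,
                    mutation_rate=0.05, weight=0.9, seed=0):
    """Genetic search over feature masks; returns the best mask found.
    Assumes at least two features."""
    rng = random.Random(seed)
    n = len(X[0])

    def score(mask):
        return fitness(X, y, mask, weight)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        parents = ranked[:pop_size // 2]      # truncation selection
        children = [ranked[0][:]]             # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each bit with probability mutation_rate
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=score)
```

Calling `select_features(X, y)` with `X` as a list of numeric feature rows and `y` as the class labels returns a 0/1 mask with one entry per feature. Raising `weight` favors accuracy, while lowering it favors smaller feature subsets.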
format Online
Article
Text
id pubmed-4342225
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4342225 2015-03-04 DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm Soufan, Othman Kleftogiannis, Dimitrios Kalnis, Panos Bajic, Vladimir B. PLoS One Research Article
Public Library of Science 2015-02-26 /pmc/articles/PMC4342225/ /pubmed/25719748 http://dx.doi.org/10.1371/journal.pone.0117988 Text en © 2015 Soufan et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4342225/
https://www.ncbi.nlm.nih.gov/pubmed/25719748
http://dx.doi.org/10.1371/journal.pone.0117988