Feature Selection for Chemical Sensor Arrays Using Mutual Information

We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best featur...

Bibliographic Details
Main authors: Wang, X. Rosalind, Lizier, Joseph T., Nowotny, Thomas, Berna, Amalia Z., Prokopenko, Mikhail, Trowell, Stephen C.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3942325/
https://www.ncbi.nlm.nih.gov/pubmed/24595058
http://dx.doi.org/10.1371/journal.pone.0089840
_version_ 1782479048447885312
author Wang, X. Rosalind
Lizier, Joseph T.
Nowotny, Thomas
Berna, Amalia Z.
Prokopenko, Mikhail
Trowell, Stephen C.
collection PubMed
description We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near optimal features for chemical sensor arrays.
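The filter criterion described in the abstract (choose the feature set with maximal mutual information with the chemical identity) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the record does not say how the authors estimated mutual information or how the sensor signals were preprocessed, so the sketch uses a simple plug-in (histogram) estimator on already-discretized values, and the names `mutual_information` and `best_feature_subset` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # Sum over observed (x, y) pairs of p(x,y) * log2(p(x,y) / (p(x) p(y))).
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def best_feature_subset(rows, labels, k):
    """Return (score, indices) of the k-feature subset whose joint value
    has the highest estimated mutual information with the labels."""
    n_features = len(rows[0])
    best = None
    for subset in combinations(range(n_features), k):
        # Treat the tuple of selected feature values as one joint variable.
        joint_values = [tuple(row[i] for i in subset) for row in rows]
        score = mutual_information(joint_values, labels)
        if best is None or score > best[0]:
            best = (score, subset)
    return best


# Hypothetical toy data: 6 samples, 3 discretized features, 2 chemical classes.
rows = [
    (0, 0, 1),
    (0, 1, 0),
    (1, 0, 1),
    (1, 1, 0),
    (0, 0, 0),
    (1, 1, 1),
]
labels = ["A", "A", "B", "B", "A", "B"]

score, subset = best_feature_subset(rows, labels, 1)
print(subset, score)
```

In this toy data the first feature determines the class exactly, so it is the one selected. Note that the plug-in estimator is biased upward when the number of samples is small relative to the number of distinct joint values, so scores for larger subsets should be read with caution; real sensor data would need a more careful estimator than this sketch.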
format Online
Article
Text
id pubmed-3942325
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3942325 2014-03-06
Feature Selection for Chemical Sensor Arrays Using Mutual Information
Wang, X. Rosalind; Lizier, Joseph T.; Nowotny, Thomas; Berna, Amalia Z.; Prokopenko, Mikhail; Trowell, Stephen C.
PLoS One Research Article
Public Library of Science 2014-03-04
/pmc/articles/PMC3942325/ /pubmed/24595058 http://dx.doi.org/10.1371/journal.pone.0089840
Text en © 2014 Wang et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Feature Selection for Chemical Sensor Arrays Using Mutual Information
topic Research Article