
Simple Stopping Criteria for Information Theoretic Feature Selection

Feature selection aims to select the smallest feature subset that yields the minimum generalization error. In the rich literature on feature selection, information-theoretic approaches seek a subset of features such that the mutual information between the selected features and the class labels is...

Bibliographic Details
Main Authors: Yu, Shujian; Príncipe, José C.
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7514210/
https://www.ncbi.nlm.nih.gov/pubmed/33266815
http://dx.doi.org/10.3390/e21010099
_version_ 1783586535721926656
author Yu, Shujian
Príncipe, José C.
author_facet Yu, Shujian
Príncipe, José C.
author_sort Yu, Shujian
collection PubMed
description Feature selection aims to select the smallest feature subset that yields the minimum generalization error. In the rich literature on feature selection, information-theoretic approaches seek a subset of features such that the mutual information between the selected features and the class labels is maximized. Despite the simplicity of this objective, several open problems in optimization remain. These include, for example, the automatic determination of the optimal subset size (i.e., the number of features) or a stopping criterion if a greedy search strategy is adopted. In this paper, we suggest two stopping criteria based simply on monitoring the conditional mutual information (CMI) among groups of variables. Using the recently developed multivariate matrix-based Rényi’s α-entropy functional, which can be directly estimated from data samples, we show that the CMI among groups of variables can be easily computed without any decomposition or approximation, making our criteria easy to implement and to integrate seamlessly into any existing information-theoretic feature selection method with a greedy search strategy.
format Online
Article
Text
id pubmed-7514210
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-75142102020-11-09 Simple Stopping Criteria for Information Theoretic Feature Selection Yu, Shujian Príncipe, José C. Entropy (Basel) Article Feature selection aims to select the smallest feature subset that yields the minimum generalization error. In the rich literature in feature selection, information theory-based approaches seek a subset of features such that the mutual information between the selected features and the class labels is maximized. Despite the simplicity of this objective, there still remain several open problems in optimization. These include, for example, the automatic determination of the optimal subset size (i.e., the number of features) or a stopping criterion if the greedy searching strategy is adopted. In this paper, we suggest two stopping criteria by just monitoring the conditional mutual information (CMI) among groups of variables. Using the recently developed multivariate matrix-based Rényi’s α-entropy functional, which can be directly estimated from data samples, we showed that the CMI among groups of variables can be easily computed without any decomposition or approximation, hence making our criteria easy to implement and seamlessly integrated into any existing information theoretic feature selection methods with a greedy search strategy. MDPI 2019-01-21 /pmc/articles/PMC7514210/ /pubmed/33266815 http://dx.doi.org/10.3390/e21010099 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Yu, Shujian
Príncipe, José C.
Simple Stopping Criteria for Information Theoretic Feature Selection
title Simple Stopping Criteria for Information Theoretic Feature Selection
title_full Simple Stopping Criteria for Information Theoretic Feature Selection
title_fullStr Simple Stopping Criteria for Information Theoretic Feature Selection
title_full_unstemmed Simple Stopping Criteria for Information Theoretic Feature Selection
title_short Simple Stopping Criteria for Information Theoretic Feature Selection
title_sort simple stopping criteria for information theoretic feature selection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7514210/
https://www.ncbi.nlm.nih.gov/pubmed/33266815
http://dx.doi.org/10.3390/e21010099
work_keys_str_mv AT yushujian simplestoppingcriteriaforinformationtheoreticfeatureselection
AT principejosec simplestoppingcriteriaforinformationtheoreticfeatureselection
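
The description in this record says that the CMI among groups of variables can be computed directly from data samples with the matrix-based Rényi’s α-entropy functional. The sketch below is one possible illustration of that idea, not the authors' implementation. It assumes a Gaussian kernel, joint entropies formed from trace-normalized Hadamard products of Gram matrices, and the standard identity I(X;Y|Z) = S(X,Z) + S(Y,Z) - S(Z) - S(X,Y,Z). The function names (`gram`, `renyi_entropy`, `cmi`) and the default values of `sigma` and `alpha` are hypothetical choices made for this example.

```python
import numpy as np


def gram(x, sigma=1.0):
    """Gaussian Gram matrix of the samples in x (rows are samples).

    The kernel and its width sigma are assumptions of this sketch.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def renyi_entropy(*grams, alpha=1.01):
    """Matrix-based Renyi alpha-entropy (in bits) of one or more variables.

    Several Gram matrices are combined by their Hadamard product and then
    normalized to unit trace, which gives a joint entropy.
    """
    h = np.ones_like(grams[0])
    for k in grams:
        h = h * k                      # element-wise (Hadamard) product
    a = h / np.trace(h)                # unit-trace normalization
    lam = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # drop tiny negatives
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)


def cmi(kx, ky, kz, alpha=1.01):
    """Conditional mutual information I(X; Y | Z) from Gram matrices."""
    return (renyi_entropy(kx, kz, alpha=alpha)
            + renyi_entropy(ky, kz, alpha=alpha)
            - renyi_entropy(kz, alpha=alpha)
            - renyi_entropy(kx, ky, kz, alpha=alpha))
```

In a greedy search, a value such as `cmi(gram(candidate), gram(labels), gram(selected))` could be monitored at each step and the search stopped once it becomes negligible. The exact form of the two stopping criteria is not given in this record, so that usage is only an assumption.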