
Information Theoretic Methods for Variable Selection—A Review

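We review the principal information theoretic tools and their use for feature selection, with the main emphasis on classification problems with discrete features. Since it is known that empirical versions of conditional mutual information perform poorly for high-dimensional problems, we focus on various ways of constructing its counterparts and the properties and limitations of such methods. We present a unified way of constructing such measures based on truncation, or truncation and weighing, for the Möbius expansion of conditional mutual information. We also discuss the main approaches to feature selection which apply the introduced measures of conditional dependence, together with the ways of assessing the quality of the obtained vector of predictors. This involves discussion of recent results on asymptotic distributions of empirical counterparts of criteria, as well as advances in resampling.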

Bibliographic Details
Main author: Mielniczuk, Jan
Format: Online Article Text
Language: English
Published: MDPI 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9407310/
https://www.ncbi.nlm.nih.gov/pubmed/36010742
http://dx.doi.org/10.3390/e24081079
collection PubMed
description We review the principal information theoretic tools and their use for feature selection, with the main emphasis on classification problems with discrete features. Since it is known that empirical versions of conditional mutual information perform poorly for high-dimensional problems, we focus on various ways of constructing its counterparts and the properties and limitations of such methods. We present a unified way of constructing such measures based on truncation, or truncation and weighing, for the Möbius expansion of conditional mutual information. We also discuss the main approaches to feature selection which apply the introduced measures of conditional dependence, together with the ways of assessing the quality of the obtained vector of predictors. This involves discussion of recent results on asymptotic distributions of empirical counterparts of criteria, as well as advances in resampling.
format Online
Article
Text
id pubmed-9407310
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9407310; 2022-08-26; Information Theoretic Methods for Variable Selection—A Review; Mielniczuk, Jan; Entropy (Basel); Article; MDPI; 2022-08-04; /pmc/articles/PMC9407310/; /pubmed/36010742; http://dx.doi.org/10.3390/e24081079; Text; en; © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).