Feature Selection based on the Local Lift Dependence Scale
This paper takes a classical approach to feature selection: minimization of a cost function applied to estimated joint distributions. In this new formulation, however, the optimization search space is extended. The original search space is the Boolean lattice of feature sets (BLFS), while the extend...
Main Authors: | Marcondes, Diego; Simonis, Adilson; Barrera, Junior |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7512664/ https://www.ncbi.nlm.nih.gov/pubmed/33265188 http://dx.doi.org/10.3390/e20020097 |
_version_ | 1783586210252324864 |
---|---|
author | Marcondes, Diego; Simonis, Adilson; Barrera, Junior |
author_sort | Marcondes, Diego |
collection | PubMed |
description | This paper takes a classical approach to feature selection: minimization of a cost function applied to estimated joint distributions. In this new formulation, however, the optimization search space is extended. The original search space is the Boolean lattice of feature sets (BLFS), while the extended one is a collection of Boolean lattices of ordered pairs (CBLOP), that is, pairs (features, associated value), indexed by the elements of the BLFS. With this approach, we may select not only the features that are most related to a variable Y, but also the values of the features that most influence the variable or that are most prone to have a specific value of Y. A local formulation of Shannon’s mutual information, which generalizes Shannon’s original definition, is applied on a CBLOP to generate a multiple resolution scale for characterizing variable dependence, the Local Lift Dependence Scale (LLDS). The main contribution of this paper is to define the LLDS and apply it to analyse local properties of joint distributions that are neglected by Shannon’s classical global measure, in order to select features. The approach is applied to select features based on the dependence between: (i) the performance of students on university entrance exams and in the courses of their first semester at the university; (ii) the party of a congress representative and their votes on different matters; (iii) the cover type of terrains and several terrain properties. |
format | Online Article Text |
id | pubmed-7512664 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7512664 2020-11-09 Feature Selection based on the Local Lift Dependence Scale. Marcondes, Diego; Simonis, Adilson; Barrera, Junior. Entropy (Basel). Article. MDPI 2018-01-30 /pmc/articles/PMC7512664/ /pubmed/33265188 http://dx.doi.org/10.3390/e20020097 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | Feature Selection based on the Local Lift Dependence Scale |
topic | Article |
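
The description above contrasts a local view of dependence with Shannon's global mutual information. As a minimal sketch of that contrast, the Python below computes a lift value for each pair (x, y) of a small joint distribution and then averages the log-lifts into a single global number. It assumes the usual definition of lift, P(x, y) / (P(x) P(y)), which this record does not spell out; the function names and the toy distribution are hypothetical and are not taken from the paper.

```python
import math


def lift_table(joint):
    """Return the lift P(x, y) / (P(x) * P(y)) for every pair in `joint`.

    `joint` maps (x, y) pairs to probabilities that sum to 1.
    Assumption: this is the standard lift definition, not a formula
    quoted from the catalog record.
    """
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p  # marginal of X
        py[y] = py.get(y, 0.0) + p  # marginal of Y
    return {(x, y): p / (px[x] * py[y])
            for (x, y), p in joint.items() if p > 0}


def mutual_information(joint):
    """Global measure: expectation of log-lift under the joint (in nats)."""
    lifts = lift_table(joint)
    return sum(p * math.log(lifts[k]) for k, p in joint.items() if p > 0)


# Toy joint distribution, invented purely for illustration.
toy_joint = {
    ("a", 0): 0.4, ("a", 1): 0.1,
    ("b", 0): 0.1, ("b", 1): 0.4,
}

for pair, value in sorted(lift_table(toy_joint).items()):
    print(pair, round(value, 3))
print("mutual information (nats):", round(mutual_information(toy_joint), 4))
```

A lift above 1 marks a pair that occurs more often than it would under independence, and a lift below 1 marks one that occurs less often. The single mutual-information value summarizes the whole table and so hides which specific feature values drive the dependence, which is the kind of local detail the description says the global measure neglects.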