Extending greedy feature selection algorithms to multiple solutions
Most feature selection methods identify only a single solution. This is acceptable for predictive purposes, but is not sufficient for knowledge discovery if multiple solutions exist. We propose a strategy to extend a class of greedy methods to efficiently identify multiple solutions, and show under...
Main Authors: | Borboudakis, Giorgos; Tsamardinos, Ioannis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8550441/ https://www.ncbi.nlm.nih.gov/pubmed/34720675 http://dx.doi.org/10.1007/s10618-020-00731-7 |
_version_ | 1784590961138991104 |
---|---|
author | Borboudakis, Giorgos Tsamardinos, Ioannis |
author_facet | Borboudakis, Giorgos Tsamardinos, Ioannis |
author_sort | Borboudakis, Giorgos |
collection | PubMed |
description | Most feature selection methods identify only a single solution. This is acceptable for predictive purposes, but is not sufficient for knowledge discovery if multiple solutions exist. We propose a strategy to extend a class of greedy methods to efficiently identify multiple solutions, and show under which conditions it identifies all solutions. We also introduce a taxonomy of features that takes the existence of multiple solutions into account. Furthermore, we explore different definitions of statistical equivalence of solutions, as well as methods for testing equivalence. A novel algorithm for compactly representing and visualizing multiple solutions is also introduced. In experiments we show that (a) the proposed algorithm is significantly more computationally efficient than the TIE* algorithm, the only alternative approach with similar theoretical guarantees, while identifying similar solutions to it, and (b) that the identified solutions have similar predictive performance. |
format | Online Article Text |
id | pubmed-8550441 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-8550441 2021-10-29 Extending greedy feature selection algorithms to multiple solutions Borboudakis, Giorgos Tsamardinos, Ioannis Data Min Knowl Discov Article Most feature selection methods identify only a single solution. This is acceptable for predictive purposes, but is not sufficient for knowledge discovery if multiple solutions exist. We propose a strategy to extend a class of greedy methods to efficiently identify multiple solutions, and show under which conditions it identifies all solutions. We also introduce a taxonomy of features that takes the existence of multiple solutions into account. Furthermore, we explore different definitions of statistical equivalence of solutions, as well as methods for testing equivalence. A novel algorithm for compactly representing and visualizing multiple solutions is also introduced. In experiments we show that (a) the proposed algorithm is significantly more computationally efficient than the TIE* algorithm, the only alternative approach with similar theoretical guarantees, while identifying similar solutions to it, and (b) that the identified solutions have similar predictive performance. Springer US 2021-05-01 2021 /pmc/articles/PMC8550441/ /pubmed/34720675 http://dx.doi.org/10.1007/s10618-020-00731-7 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . |
spellingShingle | Article Borboudakis, Giorgos Tsamardinos, Ioannis Extending greedy feature selection algorithms to multiple solutions |
title | Extending greedy feature selection algorithms to multiple solutions |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8550441/ https://www.ncbi.nlm.nih.gov/pubmed/34720675 http://dx.doi.org/10.1007/s10618-020-00731-7 |