Artificial Intelligence based wrapper for high dimensional feature selection
BACKGROUND: Feature selection is important in high dimensional data analysis. The wrapper approach is one of the ways to perform feature selection, but it is computationally intensive as it builds and evaluates models of multiple subsets of features. The existing wrapper algorithm primarily focuses...
Main Authors: | Jain, Rahi; Xu, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10585895/ https://www.ncbi.nlm.nih.gov/pubmed/37853338 http://dx.doi.org/10.1186/s12859-023-05502-x |
_version_ | 1785123045018435584 |
---|---|
author | Jain, Rahi Xu, Wei |
author_facet | Jain, Rahi Xu, Wei |
author_sort | Jain, Rahi |
collection | PubMed |
description | BACKGROUND: Feature selection is important in high dimensional data analysis. The wrapper approach is one of the ways to perform feature selection, but it is computationally intensive as it builds and evaluates models of multiple subsets of features. The existing wrapper algorithm primarily focuses on shortening the path to find an optimal feature set. However, it underutilizes the capability of feature subset models, which impacts feature selection and its predictive performance. METHOD AND RESULTS: This study proposes a novel Artificial Intelligence based Wrapper (AIWrap) algorithm that integrates Artificial Intelligence (AI) with the existing wrapper algorithm. The algorithm develops a Performance Prediction Model using AI which predicts the model performance of any feature set and allows the wrapper algorithm to evaluate the feature subset performance in a model without building the model. The algorithm can make the wrapper algorithm more relevant for high-dimensional data. We evaluate the performance of this algorithm using simulated studies and real research studies. AIWrap shows better or at par feature selection and model prediction performance than standard penalized feature selection algorithms and wrapper algorithms. CONCLUSION: AIWrap approach provides an alternative algorithm to the existing algorithms for feature selection. The current study focuses on AIWrap application in continuous cross-sectional data. However, it could be applied to other datasets like longitudinal, categorical and time-to-event biological data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05502-x. |
format | Online Article Text |
id | pubmed-10585895 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10585895 2023-10-20 Artificial Intelligence based wrapper for high dimensional feature selection Jain, Rahi Xu, Wei BMC Bioinformatics Research BioMed Central 2023-10-18 /pmc/articles/PMC10585895/ /pubmed/37853338 http://dx.doi.org/10.1186/s12859-023-05502-x Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Jain, Rahi Xu, Wei Artificial Intelligence based wrapper for high dimensional feature selection |
title | Artificial Intelligence based wrapper for high dimensional feature selection |
title_sort | artificial intelligence based wrapper for high dimensional feature selection |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10585895/ https://www.ncbi.nlm.nih.gov/pubmed/37853338 http://dx.doi.org/10.1186/s12859-023-05502-x |
work_keys_str_mv | AT jainrahi artificialintelligencebasedwrapperforhighdimensionalfeatureselection AT xuwei artificialintelligencebasedwrapperforhighdimensionalfeatureselection |
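
The abstract above describes a Performance Prediction Model (PPM) that estimates how well a feature subset would perform, so that the wrapper can rank subsets without building a model for each one. The following is only a minimal sketch of that idea under stated assumptions, not the authors' AIWrap implementation: the data are synthetic, the subset models are ordinary least squares, and the PPM is a plain linear model on 0/1 feature-inclusion masks standing in for the AI model the paper uses. Every name here (`solve`, `fit_linear`, `subset_error`, `ppm`, `shortlist`) is hypothetical.

```python
import random
from itertools import product

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_linear(X, y, ridge=1e-6):
    """Least-squares fit with an intercept; a tiny ridge keeps it solvable."""
    Z = [[1.0] + list(row) for row in X]
    d = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(d)]
    return solve(A, b)

def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def mse(coef, X, y):
    return sum((predict(coef, r) - t) ** 2 for r, t in zip(X, y)) / len(y)

# Synthetic data (assumption): only features 0 and 1 drive the outcome.
rng = random.Random(0)
P, N = 8, 60
X = [[rng.gauss(0, 1) for _ in range(P)] for _ in range(N)]
y = [3 * r[0] - 2 * r[1] + rng.gauss(0, 0.5) for r in X]
X_tr, y_tr, X_te, y_te = X[:40], y[:40], X[40:], y[40:]

def subset_error(mask):
    """Build a real model on the masked features; return its hold-out MSE."""
    cols = [j for j in range(P) if mask[j]]
    take = lambda data: [[r[j] for j in cols] for r in data]
    return mse(fit_linear(take(X_tr), y_tr), take(X_te), y_te)

# Step 1: build and evaluate real models for a small random sample of subsets.
sampled = [tuple(rng.randint(0, 1) for _ in range(P)) for _ in range(40)]
errors = [subset_error(m) for m in sampled]

# Step 2: train the PPM to map an inclusion mask to the observed error.
ppm = fit_linear(sampled, errors)

# Step 3: rank every possible subset by PPM-predicted error (no model built),
# then build real models only for a short list of the most promising ones.
candidates = sorted(product((0, 1), repeat=P), key=lambda m: predict(ppm, m))
shortlist = candidates[:5]
best = min(shortlist, key=subset_error)
selected = [j for j in range(P) if best[j]]
print("selected features:", selected)
```

In this sketch only 45 subset models are actually fitted (40 sampled plus 5 short-listed), while all 256 possible subsets are ranked through the PPM. A linear PPM cannot capture interactions between features, which is one reason a more flexible learner may be preferable in practice.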