
Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models

The two-stage feature screening method for linear models applies dimension reduction at the first stage to screen out nuisance features and dramatically reduce the dimension to a moderate size; at the second stage, penalized methods such as LASSO and SCAD can be applied for feature selection. A majori...


Bibliographic Details
Main authors: Jiang, Jinzhu; Shang, Junfeng
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10296932/
https://www.ncbi.nlm.nih.gov/pubmed/37372195
http://dx.doi.org/10.3390/e25060851
_version_ 1785063764620476416
author Jiang, Jinzhu
Shang, Junfeng
author_facet Jiang, Jinzhu
Shang, Junfeng
author_sort Jiang, Jinzhu
collection PubMed
description The two-stage feature screening method for linear models applies dimension reduction at the first stage to screen out nuisance features and dramatically reduce the dimension to a moderate size; at the second stage, penalized methods such as LASSO and SCAD can be applied for feature selection. Most subsequent work on sure independence screening methods has focused on the linear model. This motivates us to extend the independence screening method to generalized linear models, particularly those with a binary response, by using the point-biserial correlation. We develop a two-stage feature screening method called point-biserial sure independence screening (PB-SIS) for high-dimensional generalized linear models, aiming for high selection accuracy and low computational cost. We demonstrate that PB-SIS is a feature screening method with high efficiency. The PB-SIS method possesses the sure independence property under certain regularity conditions. A set of simulation studies is conducted and confirms the sure independence property and the accuracy and efficiency of PB-SIS. Finally, we apply PB-SIS to one real data example to show its effectiveness.
format Online
Article
Text
id pubmed-10296932
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-102969322023-06-28 Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models Jiang, Jinzhu Shang, Junfeng Entropy (Basel) Article The two-stage feature screening method for linear models applies dimension reduction at first stage to screen out nuisance features and dramatically reduce the dimension to a moderate size; at the second stage, penalized methods such as LASSO and SCAD could be applied for feature selection. A majority of subsequent works on the sure independent screening methods have focused mainly on the linear model. This motivates us to extend the independence screening method to generalized linear models, and particularly with binary response by using the point-biserial correlation. We develop a two-stage feature screening method called point-biserial sure independence screening (PB-SIS) for high-dimensional generalized linear models, aiming for high selection accuracy and low computational cost. We demonstrate that PB-SIS is a feature screening method with high efficiency. The PB-SIS method possesses the sure independence property under certain regularity conditions. A set of simulation studies are conducted and confirm the sure independence property and the accuracy and efficiency of PB-SIS. Finally we apply PB-SIS to one real data example to show its effectiveness. MDPI 2023-05-26 /pmc/articles/PMC10296932/ /pubmed/37372195 http://dx.doi.org/10.3390/e25060851 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Jiang, Jinzhu
Shang, Junfeng
Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models
title Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models
title_full Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models
title_fullStr Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models
title_full_unstemmed Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models
title_short Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models
title_sort feature screening for high-dimensional variable selection in generalized linear models
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10296932/
https://www.ncbi.nlm.nih.gov/pubmed/37372195
http://dx.doi.org/10.3390/e25060851
work_keys_str_mv AT jiangjinzhu featurescreeningforhighdimensionalvariableselectioningeneralizedlinearmodels
AT shangjunfeng featurescreeningforhighdimensionalvariableselectioningeneralizedlinearmodels