Prediction and Variable Selection in High-Dimensional Misspecified Binary Classification

In this paper, we consider prediction and variable selection in the misspecified binary classification models under the high-dimensional scenario. We focus on two approaches to classification, which are computationally efficient, but lead to model misspecification. The first one is to apply penalize...

Bibliographic Details
Main authors: Furmańczyk, Konrad; Rejchel, Wojciech
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517038/
https://www.ncbi.nlm.nih.gov/pubmed/33286314
http://dx.doi.org/10.3390/e22050543
author Furmańczyk, Konrad
Rejchel, Wojciech
collection PubMed
description In this paper, we consider prediction and variable selection in the misspecified binary classification models under the high-dimensional scenario. We focus on two approaches to classification, which are computationally efficient, but lead to model misspecification. The first one is to apply penalized logistic regression to the classification data, which possibly do not follow the logistic model. The second method is even more radical: we just treat class labels of objects as they were numbers and apply penalized linear regression. In this paper, we investigate thoroughly these two approaches and provide conditions, which guarantee that they are successful in prediction and variable selection. Our results hold even if the number of predictors is much larger than the sample size. The paper is completed by the experimental results.
format Online
Article
Text
id pubmed-7517038
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7517038 2020-11-09 Prediction and Variable Selection in High-Dimensional Misspecified Binary Classification Furmańczyk, Konrad Rejchel, Wojciech Entropy (Basel) Article In this paper, we consider prediction and variable selection in the misspecified binary classification models under the high-dimensional scenario. We focus on two approaches to classification, which are computationally efficient, but lead to model misspecification. The first one is to apply penalized logistic regression to the classification data, which possibly do not follow the logistic model. The second method is even more radical: we just treat class labels of objects as they were numbers and apply penalized linear regression. In this paper, we investigate thoroughly these two approaches and provide conditions, which guarantee that they are successful in prediction and variable selection. Our results hold even if the number of predictors is much larger than the sample size. The paper is completed by the experimental results. MDPI 2020-05-13 /pmc/articles/PMC7517038/ /pubmed/33286314 http://dx.doi.org/10.3390/e22050543 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Prediction and Variable Selection in High-Dimensional Misspecified Binary Classification
topic Article