Robust model selection using the out-of-bag bootstrap in linear regression

Outlying observations have a large influence on the linear model selection process. In this article, we present a novel approach to robust model selection in linear regression to accommodate the situations where outliers are present in the data. The model selection criterion is based on two componen...

Bibliographic Details
Main authors: Rabbi, Fazli, Khalil, Alamgir, Khan, Ilyas, Almuqrin, Muqrin A., Khalil, Umair, Andualem, Mulugeta
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9243146/
https://www.ncbi.nlm.nih.gov/pubmed/35768449
http://dx.doi.org/10.1038/s41598-022-14398-1
author Rabbi, Fazli
Khalil, Alamgir
Khan, Ilyas
Almuqrin, Muqrin A.
Khalil, Umair
Andualem, Mulugeta
author_sort Rabbi, Fazli
collection PubMed
description Outlying observations strongly influence linear model selection. In this article, we present a novel approach to robust model selection in linear regression for data that contain outliers. The model selection criterion has two components: the robust conditional expected prediction loss and a robust goodness-of-fit measure with a penalty term. We estimate the conditional expected prediction loss using the out-of-bag stratified bootstrap. In the presence of outliers, the stratified bootstrap ensures that the bootstrap samples are similar to the original sample. Furthermore, to control the undue effect of outliers, we use the robust MM-estimator and a bounded loss function in the proposed criterion. Specifically, we observe that minimizing the penalized loss function and the conditional expected prediction loss simultaneously works better than minimizing either one separately. Simulation and real-data studies confirm the consistent and satisfactory behavior of our bootstrap model selection procedure in the presence of both response outliers and covariate outliers.
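The out-of-bag part of the description can be illustrated with a minimal sketch. It is not the authors' procedure: the record gives no formulas, so several pieces here are assumptions. The strata (flagged versus unflagged points, using a 2.5 cutoff on scaled residuals), the Tukey bisquare loss with tuning constant 4.685, the MAD scale taken from the largest candidate model, and the Theil-Sen and median fits (used as simple robust stand-ins for the MM-estimator) are all illustrative choices, and every function name is hypothetical. The penalized goodness-of-fit component of the criterion is not reproduced.

```python
import random
import statistics


def tukey_loss(r, c=4.685):
    """Bounded Tukey bisquare loss; constant (c^2 / 6) once |r| >= c."""
    if abs(r) >= c:
        return c * c / 6.0
    return (c * c / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3)


def mad_scale(residuals):
    """Robust residual scale: normalized median absolute deviation."""
    med = statistics.median(residuals)
    return 1.4826 * statistics.median([abs(r - med) for r in residuals])


def fit_median(xs, ys):
    """Intercept-only candidate (stand-in robust fit, not an MM-estimator)."""
    m = statistics.median(ys)
    return lambda x: m


def fit_theil_sen(xs, ys):
    """One-covariate candidate: Theil-Sen line (stand-in for an MM fit)."""
    n = len(xs)
    slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
              for i in range(n) for j in range(i + 1, n) if xs[j] != xs[i]]
    b = statistics.median(slopes) if slopes else 0.0
    a = statistics.median([y - b * x for x, y in zip(xs, ys)])
    return lambda x: a + b * x


def oob_loss(fit, xs, ys, strata, scale, n_boot=200, seed=0):
    """Mean bounded loss on out-of-bag points over stratified resamples."""
    rng = random.Random(seed)
    groups = {}
    for i, s in enumerate(strata):
        groups.setdefault(s, []).append(i)
    per_rep = []
    for _ in range(n_boot):
        drawn = []
        for members in groups.values():
            # Resample within each stratum so its share of the sample is kept.
            drawn.extend(rng.choices(members, k=len(members)))
        in_bag = set(drawn)
        oob = [i for i in range(len(xs)) if i not in in_bag]
        if not oob:
            continue
        predict = fit([xs[i] for i in drawn], [ys[i] for i in drawn])
        per_rep.append(statistics.fmean(
            [tukey_loss((ys[i] - predict(xs[i])) / scale) for i in oob]))
    return statistics.fmean(per_rep)


# Synthetic data (not from the article): a linear trend with three
# response outliers.
data_rng = random.Random(1)
xs = [float(i) for i in range(30)]
ys = [2.0 + 0.5 * x + data_rng.gauss(0.0, 0.5) for x in xs]
for i in (3, 12, 27):
    ys[i] += 15.0

# Common scale and strata come from the largest candidate model.
full = fit_theil_sen(xs, ys)
residuals = [y - full(x) for x, y in zip(xs, ys)]
scale = mad_scale(residuals)
strata = [abs(r) / scale > 2.5 for r in residuals]

loss_median = oob_loss(fit_median, xs, ys, strata, scale)
loss_line = oob_loss(fit_theil_sen, xs, ys, strata, scale)
print("intercept-only:", round(loss_median, 3), "line:", round(loss_line, 3))
```

In this sketch the candidate with the smaller out-of-bag loss would be preferred. Because the loss is bounded, a gross outlier that lands out of bag contributes at most c^2 / 6 to the average and cannot dominate the comparison.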
format Online
Article
Text
id pubmed-9243146
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9243146 2022-07-01 Robust model selection using the out-of-bag bootstrap in linear regression. Rabbi, Fazli; Khalil, Alamgir; Khan, Ilyas; Almuqrin, Muqrin A.; Khalil, Umair; Andualem, Mulugeta. Sci Rep. Article. Nature Publishing Group UK 2022-06-29 /pmc/articles/PMC9243146/ /pubmed/35768449 http://dx.doi.org/10.1038/s41598-022-14398-1 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Robust model selection using the out-of-bag bootstrap in linear regression
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9243146/
https://www.ncbi.nlm.nih.gov/pubmed/35768449
http://dx.doi.org/10.1038/s41598-022-14398-1