Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria

Multivariate Adaptive Regression Splines (MARS) is a useful non-parametric regression analysis method that can be used for model selection in high-dimensional data. Since MARS can identify and model complex, non-linear relationships between the dependent variable and independent variables without re...

Bibliographic Details
Main Authors: Bekar Adiguzel, Meryem; Cengiz, Mehmet Ali
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10559550/
https://www.ncbi.nlm.nih.gov/pubmed/37809827
http://dx.doi.org/10.1016/j.heliyon.2023.e19964
_version_ 1785117525212659712
author Bekar Adiguzel, Meryem
Cengiz, Mehmet Ali
author_facet Bekar Adiguzel, Meryem
Cengiz, Mehmet Ali
author_sort Bekar Adiguzel, Meryem
collection PubMed
description Multivariate Adaptive Regression Splines (MARS) is a useful non-parametric regression analysis method that can be used for model selection in high-dimensional data. Because MARS can identify and model complex, non-linear relationships between the dependent variable and the independent variables without requiring any assumptions, it has an advantage over simple linear regression techniques. To simplify model building and prevent overfitting, MARS can also automatically select the variables to be included in the model, which is useful for datasets with many variables. Within the MARS framework, the generalized cross-validation (GCV) technique is used to avoid overfitting and to select the best model. The GCV criterion is widely used and can be effective in many situations; however, it has drawn some criticism: the value of the smoothing parameter used in the GCV algorithm is arbitrary, and the models obtained with this criterion are high-dimensional. To address this criticism, this paper aims to obtain the most parsimonious model that best explains the relationship between the dependent variable and the independent variables by using alternative information criteria (the Akaike information criterion (AIC), the Schwarz Bayesian criterion (SBC), and the information complexity criterion ([Formula: see text])) instead of smoothing parameters. To this end, a simulation study was first conducted with a data set composed of variables that do and do not contribute to the dependent variable, in order to test the success of the information criteria. In this simulation, the variables that do not contribute to the dependent variable were not included in the regression model, which demonstrates the success of the criteria in model selection. As a real data set, the reasons for loan defaults between 2005 and 2019 were investigated using data from 18 banks operating in Türkiye. The results reveal the success of the [Formula: see text] criterion in model selection.
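The comparison described in the abstract, scoring candidate models with GCV versus information criteria, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure. The GCV effective-parameter count uses the common MARS convention with a smoothing penalty `penalty`. AIC and SBC are written in their Gaussian least-squares forms, up to additive constants. The information complexity criterion is left out because its formula is not reproduced in this record. The data, knot locations, and all function names are hypothetical.

```python
import numpy as np

def hinge(x, knot, sign=1):
    # MARS-style hinge basis function: max(sign * (x - knot), 0).
    return np.maximum(sign * (x - knot), 0.0)

def fit_rss(B, y):
    # Ordinary least squares on the basis matrix B; returns the residual sum of squares.
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    return float(resid @ resid)

def gcv(rss, n, n_terms, penalty=3.0):
    # Assumed convention: effective parameters = terms + penalty * (terms - 1) / 2.
    c = n_terms + penalty * (n_terms - 1) / 2.0
    return (rss / n) / (1.0 - c / n) ** 2

def aic(rss, n, k):
    # Gaussian least-squares form, additive constants dropped (assumption).
    return n * np.log(rss / n) + 2 * k

def sbc(rss, n, k):
    # Same likelihood term as AIC, with a log(n) penalty per parameter.
    return n * np.log(rss / n) + k * np.log(n)

# Synthetic data: x1 contributes to y, x2 does not (hypothetical example).
rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(-1, 1, n)
x2 = rng.uniform(-1, 1, n)
y = 2.0 * hinge(x1, 0.0) + rng.normal(0, 0.1, n)

ones = np.ones(n)
base = [ones, hinge(x1, 0.0), hinge(x1, 0.0, -1)]
candidates = {
    "x1 only": np.column_stack(base),
    "x1 + x2": np.column_stack(base + [hinge(x2, 0.0), hinge(x2, 0.0, -1)]),
}

scores = {}
for name, B in candidates.items():
    rss = fit_rss(B, y)
    k = B.shape[1]  # number of basis terms, including the intercept
    scores[name] = {
        "RSS": rss,
        "GCV": gcv(rss, n, k),
        "AIC": aic(rss, n, k),
        "SBC": sbc(rss, n, k),
    }

for name, s in scores.items():
    print(name, {key: round(float(val), 4) for key, val in s.items()})
```

Lower values are better for all three criteria. Note that the GCV score depends on the chosen `penalty`, which is the arbitrariness the abstract criticizes, whereas AIC and SBC have fixed penalty terms.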
format Online
Article
Text
id pubmed-10559550
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10559550 2023-10-08 Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria Bekar Adiguzel, Meryem Cengiz, Mehmet Ali Heliyon Research Article Elsevier 2023-09-17 /pmc/articles/PMC10559550/ /pubmed/37809827 http://dx.doi.org/10.1016/j.heliyon.2023.e19964 Text en © 2023 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Research Article
Bekar Adiguzel, Meryem
Cengiz, Mehmet Ali
Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria
title Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria
title_full Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria
title_fullStr Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria
title_full_unstemmed Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria
title_short Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria
title_sort model selection in multivariate adaptive regressions splines (mars) using alternative information criteria
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10559550/
https://www.ncbi.nlm.nih.gov/pubmed/37809827
http://dx.doi.org/10.1016/j.heliyon.2023.e19964
work_keys_str_mv AT bekaradiguzelmeryem modelselectioninmultivariateadaptiveregressionssplinesmarsusingalternativeinformationcriteria
AT cengizmehmetali modelselectioninmultivariateadaptiveregressionssplinesmarsusingalternativeinformationcriteria