Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria
Multivariate Adaptive Regression Splines (MARS) is a useful non-parametric regression analysis method that can be used for model selection in high-dimensional data. Since MARS can identify and model complex, non-linear relationships between the dependent variable and independent variables without re...
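To make the "complex, non-linear relationships" in the summary concrete: MARS builds its fit from mirrored hinge functions, max(0, x - t) and max(0, t - x), placed at knots t. The following is a minimal sketch of that basis, not the authors' implementation. The toy data, the knot at 0.5, and the helper name `hinge_pair` are assumptions made for illustration; a real MARS fit searches for knots and variables automatically.

```python
import numpy as np

def hinge_pair(x, knot):
    """Return the mirrored hinge functions max(0, x - knot) and max(0, knot - x)."""
    x = np.asarray(x, dtype=float)
    return np.maximum(0.0, x - knot), np.maximum(0.0, knot - x)

# Toy data (invented for illustration): a piecewise-linear signal with a kink at 0.5.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
y = 2.0 * np.maximum(0.0, x - 0.5) + rng.normal(scale=0.05, size=x.size)

# Design matrix: intercept plus one hinge pair at the assumed knot.
right, left = hinge_pair(x, knot=0.5)
B = np.column_stack([np.ones_like(x), right, left])

# Ordinary least squares on the hinge basis.
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
print(coef)  # intercept, right-hinge slope, left-hinge slope
```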
Main Authors: | Bekar Adiguzel, Meryem; Cengiz, Mehmet Ali |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10559550/ https://www.ncbi.nlm.nih.gov/pubmed/37809827 http://dx.doi.org/10.1016/j.heliyon.2023.e19964 |
_version_ | 1785117525212659712 |
---|---|
author | Bekar Adiguzel, Meryem Cengiz, Mehmet Ali |
author_facet | Bekar Adiguzel, Meryem Cengiz, Mehmet Ali |
author_sort | Bekar Adiguzel, Meryem |
collection | PubMed |
description | Multivariate Adaptive Regression Splines (MARS) is a non-parametric regression method that can be used for model selection in high-dimensional data. Because MARS can identify and model complex, non-linear relationships between the dependent variable and the independent variables without requiring any assumptions, it has an advantage over simple linear regression techniques. MARS can also automatically select the variables to be included in the model, which simplifies model building, helps prevent overfitting, and is useful for datasets with many variables. Within the MARS framework, the generalized cross-validation (GCV) criterion is used to avoid overfitting and to select the best model. The GCV criterion is widely used and can be effective in many situations; however, it has drawn some criticism: the value of the smoothing parameter used in the GCV algorithm is arbitrary, and the models obtained with this criterion are high-dimensional. This paper aims to obtain the simplest model that best explains the relationship between the dependent variable and the independent variables by using alternative information criteria (the Akaike information criterion (AIC), the Schwarz Bayesian criterion (SBC), and the information complexity criterion ([Formula: see text])) instead of smoothing parameters, and thereby to address this criticism. To this end, a simulation study was first conducted on a data set composed of variables that do and do not contribute to the dependent variable, in order to test the success of the information criteria. In the simulation, the variables that do not contribute to the dependent variable were not included in the regression model, which demonstrates the success of the criteria in model selection. As a real data set, the reasons for loan defaults between 2005 and 2019 were investigated using data from 18 banks operating in Türkiye. The results reveal the success of the [Formula: see text] criterion in model selection. |
format | Online Article Text |
id | pubmed-10559550 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-10559550 2023-10-08 Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria Bekar Adiguzel, Meryem Cengiz, Mehmet Ali Heliyon Research Article Elsevier 2023-09-17 /pmc/articles/PMC10559550/ /pubmed/37809827 http://dx.doi.org/10.1016/j.heliyon.2023.e19964 Text en © 2023 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Research Article Bekar Adiguzel, Meryem Cengiz, Mehmet Ali Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria |
title | Model selection in multivariate adaptive regressions splines (MARS) using alternative information criteria |
title_sort | model selection in multivariate adaptive regressions splines (mars) using alternative information criteria |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10559550/ https://www.ncbi.nlm.nih.gov/pubmed/37809827 http://dx.doi.org/10.1016/j.heliyon.2023.e19964 |
work_keys_str_mv | AT bekaradiguzelmeryem modelselectioninmultivariateadaptiveregressionssplinesmarsusingalternativeinformationcriteria AT cengizmehmetali modelselectioninmultivariateadaptiveregressionssplinesmarsusingalternativeinformationcriteria |
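
The abstract contrasts GCV, which depends on a user-chosen smoothing (penalty) parameter, with information criteria that do not. The sketch below shows how such criteria can be computed for two candidate hinge-basis models. It is a hedged illustration under stated assumptions, not the paper's procedure: GCV uses the commonly cited MARS form with an effective parameter count of terms + penalty × (terms − 1) / 2, AIC and SBC use the Gaussian least-squares forms with additive constants dropped, and the information complexity criterion is omitted because its formula is not recoverable from this record. The toy data, knots, and function names are invented for the example.

```python
import numpy as np

def rss(B, y):
    """Residual sum of squares of the least-squares fit of y on the columns of B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    return float(resid @ resid)

def gcv(rss_value, n, n_terms, penalty=3.0):
    """GCV with an assumed penalty per knot; the penalty is the user-chosen value."""
    effective_params = n_terms + penalty * (n_terms - 1) / 2.0
    return (rss_value / n) / (1.0 - effective_params / n) ** 2

def aic(rss_value, n, k):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k."""
    return n * np.log(rss_value / n) + 2.0 * k

def sbc(rss_value, n, k):
    """Gaussian SBC (BIC) up to an additive constant: n * ln(RSS / n) + k * ln(n)."""
    return n * np.log(rss_value / n) + k * np.log(n)

# Toy data (invented): x1 drives y through one hinge, x2 does not contribute.
rng = np.random.default_rng(1)
n = 300
x1 = rng.uniform(-1.0, 1.0, n)
x2 = rng.uniform(-1.0, 1.0, n)
y = 1.5 * np.maximum(0.0, x1 - 0.2) + rng.normal(scale=0.1, size=n)

# Candidate models: hinge on x1 only, versus the same model plus hinges on x2.
small = np.column_stack([np.ones(n), np.maximum(0.0, x1 - 0.2)])
big = np.column_stack([small, np.maximum(0.0, x2), np.maximum(0.0, -x2)])

for name, B in [("small", small), ("big", big)]:
    r, k = rss(B, y), B.shape[1]
    print(name, "GCV:", gcv(r, n, k), "AIC:", aic(r, n, k), "SBC:", sbc(r, n, k))
```

Lower values are better for all three criteria. Changing `penalty` changes the GCV ranking but leaves AIC and SBC untouched, which is the dependence on an arbitrary parameter that the abstract criticizes.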