A simple function for full‐subsets multiple regression in ecology with R
Full‐subsets information theoretic approaches are becoming an increasingly popular tool for exploring predictive power and variable importance where a wide range of candidate predictors are being considered. Here, we describe a simple function in the statistical programming language R that can be us...
Main authors: | Fisher, Rebecca; Wilson, Shaun K.; Sin, Tsai M.; Lee, Ai C.; Langlois, Tim J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6024142/ https://www.ncbi.nlm.nih.gov/pubmed/29988441 http://dx.doi.org/10.1002/ece3.4134 |
_version_ | 1783336004856315904 |
---|---|
author | Fisher, Rebecca Wilson, Shaun K. Sin, Tsai M. Lee, Ai C. Langlois, Tim J. |
author_facet | Fisher, Rebecca Wilson, Shaun K. Sin, Tsai M. Lee, Ai C. Langlois, Tim J. |
author_sort | Fisher, Rebecca |
collection | PubMed |
description | Full‐subsets information theoretic approaches are becoming an increasingly popular tool for exploring predictive power and variable importance where a wide range of candidate predictors are being considered. Here, we describe a simple function in the statistical programming language R that can be used to construct, fit, and compare a complete model set of possible ecological or environmental predictors, given a response variable of interest and a starting generalized additive (mixed) model fit. Main advantages include not requiring a complete model to be fit as the starting point for candidate model set construction (meaning that a greater number of predictors can potentially be explored than might be available through functions such as dredge); model sets that include interactions between factors and continuous nonlinear predictors; and automatic removal of models with correlated predictors (based on a user defined criterion for exclusion). The function takes continuous predictors, which are fitted using smoothers via either gam, gamm (mgcv) or gamm4, as well as factor variables which are included on their own or as two‐level interaction terms within the gam smooth (via use of the “by” argument), or with themselves. The function allows any model to be constructed and used as a null model, and takes a range of arguments that allow control over the model set being constructed, including specifying cyclic and linear continuous predictors, specification of the smoothing algorithm used, and the maximum complexity allowed for smooth terms. The use of the function is demonstrated via case studies that highlight how appropriate model sets can be easily constructed and the broader utility of the approach for exploratory ecology. |
format | Online Article Text |
id | pubmed-6024142 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-60241422018-07-09 A simple function for full‐subsets multiple regression in ecology with R Fisher, Rebecca Wilson, Shaun K. Sin, Tsai M. Lee, Ai C. Langlois, Tim J. Ecol Evol Original Research Full‐subsets information theoretic approaches are becoming an increasingly popular tool for exploring predictive power and variable importance where a wide range of candidate predictors are being considered. Here, we describe a simple function in the statistical programming language R that can be used to construct, fit, and compare a complete model set of possible ecological or environmental predictors, given a response variable of interest and a starting generalized additive (mixed) model fit. Main advantages include not requiring a complete model to be fit as the starting point for candidate model set construction (meaning that a greater number of predictors can potentially be explored than might be available through functions such as dredge); model sets that include interactions between factors and continuous nonlinear predictors; and automatic removal of models with correlated predictors (based on a user defined criterion for exclusion). The function takes continuous predictors, which are fitted using smoothers via either gam, gamm (mgcv) or gamm4, as well as factor variables which are included on their own or as two‐level interaction terms within the gam smooth (via use of the “by” argument), or with themselves. The function allows any model to be constructed and used as a null model, and takes a range of arguments that allow control over the model set being constructed, including specifying cyclic and linear continuous predictors, specification of the smoothing algorithm used, and the maximum complexity allowed for smooth terms. The use of the function is demonstrated via case studies that highlight how appropriate model sets can be easily constructed and the broader utility of the approach for exploratory ecology. John Wiley and Sons Inc. 
2018-05-20 /pmc/articles/PMC6024142/ /pubmed/29988441 http://dx.doi.org/10.1002/ece3.4134 Text en © 2018 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Research Fisher, Rebecca Wilson, Shaun K. Sin, Tsai M. Lee, Ai C. Langlois, Tim J. A simple function for full‐subsets multiple regression in ecology with R |
title | A simple function for full‐subsets multiple regression in ecology with R |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6024142/ https://www.ncbi.nlm.nih.gov/pubmed/29988441 http://dx.doi.org/10.1002/ece3.4134 |
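
The description field in this record outlines two steps that are easy to picture in code: building every subset of the candidate predictors, and dropping any candidate model that contains a pair of predictors correlated beyond a user-defined criterion. The following is a minimal, language-neutral sketch of that idea in Python. It is not the R function described in the article and does not fit any models; the names `build_model_set`, `max_predictors`, and `cor_cutoff`, the cutoff value of 0.5, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def build_model_set(data, max_predictors, cor_cutoff):
    """Enumerate predictor subsets, skipping those with a correlated pair.

    data: dict mapping a (hypothetical) predictor name to its values.
    Returns a list of tuples of predictor names; () stands for the null model.
    """
    names = sorted(data)
    # Pairs whose absolute correlation exceeds the exclusion criterion.
    too_correlated = {
        frozenset(pair)
        for pair in combinations(names, 2)
        if abs(pearson(data[pair[0]], data[pair[1]])) > cor_cutoff
    }
    model_set = [()]  # start from the null model, with no candidate predictors
    for size in range(1, max_predictors + 1):
        for combo in combinations(names, size):
            has_bad_pair = any(
                frozenset(p) in too_correlated for p in combinations(combo, 2)
            )
            if not has_bad_pair:
                model_set.append(combo)
    return model_set


# Toy data (invented): depth and temp rise together, cover does not.
data = {
    "depth": [1, 2, 3, 4, 5],
    "temp": [2.1, 3.9, 6.2, 8.0, 9.9],
    "cover": [5, 1, 4, 2, 3],
}
models = build_model_set(data, max_predictors=3, cor_cutoff=0.5)
for m in models:
    print(m if m else "(null model)")
```

In this toy example, `depth` and `temp` are nearly collinear, so every subset containing both is skipped before any fitting would take place. Each remaining tuple would then be turned into a model formula, fitted, and compared using an information criterion, which this sketch leaves out.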