
Regression with Highly Correlated Predictors: Variable Omission Is Not the Solution

Regression models have been in use for decades to explore and quantify the association between a dependent response and several independent variables in environmental sciences, epidemiology and public health. However, researchers often encounter situations in which some independent variables exhibit...


Bibliographic Details
Main authors: Gregorich, Mariella, Strohmaier, Susanne, Dunkler, Daniela, Heinze, Georg
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8073086/
https://www.ncbi.nlm.nih.gov/pubmed/33920501
http://dx.doi.org/10.3390/ijerph18084259
author Gregorich, Mariella
Strohmaier, Susanne
Dunkler, Daniela
Heinze, Georg
collection PubMed
description Regression models have been in use for decades to explore and quantify the association between a dependent response and several independent variables in environmental sciences, epidemiology and public health. However, researchers often encounter situations in which some independent variables exhibit high bivariate correlation, or may even be collinear. Improper statistical handling of this situation will most certainly generate models of little or no practical use and misleading interpretations. By means of two example studies, we demonstrate how diagnostic tools for collinearity or near-collinearity may fail in guiding the analyst. Instead, the most appropriate way of handling collinearity should be driven by the research question at hand and, in particular, by the distinction between predictive or explanatory aims.
format Online
Article
Text
id pubmed-8073086
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8073086 2021-04-27 Regression with Highly Correlated Predictors: Variable Omission Is Not the Solution Gregorich, Mariella Strohmaier, Susanne Dunkler, Daniela Heinze, Georg Int J Environ Res Public Health Article Regression models have been in use for decades to explore and quantify the association between a dependent response and several independent variables in environmental sciences, epidemiology and public health. However, researchers often encounter situations in which some independent variables exhibit high bivariate correlation, or may even be collinear. Improper statistical handling of this situation will most certainly generate models of little or no practical use and misleading interpretations. By means of two example studies, we demonstrate how diagnostic tools for collinearity or near-collinearity may fail in guiding the analyst. Instead, the most appropriate way of handling collinearity should be driven by the research question at hand and, in particular, by the distinction between predictive or explanatory aims. MDPI 2021-04-17 /pmc/articles/PMC8073086/ /pubmed/33920501 http://dx.doi.org/10.3390/ijerph18084259 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
topic Article