Model selection in Medical Research: A simulation study comparing Bayesian Model Averaging and Stepwise Regression

BACKGROUND: Automatic variable selection methods are usually discouraged in medical research although we believe they might be valuable for studies where subject matter knowledge is limited. Bayesian model averaging may be useful for model selection but only limited attempts to compare it to stepwis...

Bibliographic Details
Main authors:	Genell, Anna; Nemes, Szilard; Steineck, Gunnar; Dickman, Paul W
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3017523/ https://www.ncbi.nlm.nih.gov/pubmed/21134252 http://dx.doi.org/10.1186/1471-2288-10-108

_version_	1782195906800517120
author	Genell, Anna; Nemes, Szilard; Steineck, Gunnar; Dickman, Paul W
author_facet	Genell, Anna; Nemes, Szilard; Steineck, Gunnar; Dickman, Paul W
author_sort	Genell, Anna
collection	PubMed
description	BACKGROUND: Automatic variable selection methods are usually discouraged in medical research although we believe they might be valuable for studies where subject matter knowledge is limited. Bayesian model averaging may be useful for model selection but only limited attempts to compare it to stepwise regression have been published. We therefore performed a simulation study to compare stepwise regression with Bayesian model averaging. METHODS: We simulated data corresponding to five different data generating processes and thirty different values of the effect size (the parameter estimate divided by its standard error). Each data generating process contained twenty explanatory variables in total and had between zero and two true predictors. Three data generating processes were built of uncorrelated predictor variables while two had a mixture of correlated and uncorrelated variables. We fitted linear regression models to the simulated data. We used Bayesian model averaging and stepwise regression respectively as model selection procedures and compared the estimated selection probabilities. RESULTS: The estimated probability of not selecting a redundant variable was between 0.99 and 1 for Bayesian model averaging while approximately 0.95 for stepwise regression when the redundant variable was not correlated with a true predictor. These probabilities did not depend on the effect size of the true predictor. In the case of correlation between a redundant variable and a true predictor, the probability of not selecting a redundant variable was 0.95 to 1 for Bayesian model averaging while for stepwise regression it was between 0.7 and 0.9, depending on the effect size of the true predictor. The probability of selecting a true predictor increased as the effect size of the true predictor increased and leveled out at between 0.9 and 1 for stepwise regression, while it leveled out at 1 for Bayesian model averaging. 
CONCLUSIONS: Our simulation study showed that under the given conditions, Bayesian model averaging had a higher probability of not selecting a redundant variable than stepwise regression and had a similar probability of selecting a true predictor. Medical researchers building regression models with limited subject matter knowledge could thus benefit from using Bayesian model averaging.
format	Text
id	pubmed-3017523
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3017523 2011-01-10 Genell, Anna; Nemes, Szilard; Steineck, Gunnar; Dickman, Paul W BMC Med Res Methodol Research Article BioMed Central 2010-12-06 /pmc/articles/PMC3017523/ /pubmed/21134252 http://dx.doi.org/10.1186/1471-2288-10-108 Text en Copyright ©2010 Genell et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Model selection in Medical Research: A simulation study comparing Bayesian Model Averaging and Stepwise Regression
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3017523/ https://www.ncbi.nlm.nih.gov/pubmed/21134252 http://dx.doi.org/10.1186/1471-2288-10-108
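
The simulation design summarized in the description above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: the sample size, the coefficient value, the BIC-weight approximation to Bayesian model averaging restricted to models with at most two predictors, and the forward-entry threshold of 1.96 are all assumptions made here for illustration. The function names (simulate, bic_inclusion_probs, forward_stepwise) are hypothetical.

```python
import itertools
import numpy as np


def simulate(n=200, p=20, beta=0.5, seed=0):
    """One data set: p uncorrelated predictors, only column 0 is a true predictor."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    y = beta * X[:, 0] + rng.standard_normal(n)
    return X, y


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def bic_inclusion_probs(X, y, max_size=2):
    """Approximate posterior inclusion probabilities from BIC model weights.

    Assumption: equal prior weight on every model with at most max_size predictors.
    """
    n, p = X.shape
    models = [m for k in range(max_size + 1)
              for m in itertools.combinations(range(p), k)]
    bic = np.array([n * np.log(rss(X, y, m) / n) + (len(m) + 1) * np.log(n)
                    for m in models])
    w = np.exp(-0.5 * (bic - bic.min()))
    w /= w.sum()
    pip = np.zeros(p)
    for m, wm in zip(models, w):
        for j in m:
            pip[j] += wm
    return pip


def forward_stepwise(X, y, z_enter=1.96):
    """Forward selection: add the best candidate while its |t| exceeds z_enter."""
    n, p = X.shape
    selected = []
    while len(selected) < p:
        current = rss(X, y, selected)
        best_j, best_rss = None, current
        for j in range(p):
            if j in selected:
                continue
            r = rss(X, y, selected + [j])
            if r < best_rss:
                best_j, best_rss = j, r
        if best_j is None:
            break
        df = n - len(selected) - 2  # residual df of the larger model
        t_squared = (current - best_rss) / (best_rss / df)  # F for one added term
        if t_squared < z_enter ** 2:
            break
        selected.append(best_j)
    return selected


if __name__ == "__main__":
    X, y = simulate()
    pip = bic_inclusion_probs(X, y)
    print("inclusion probability, true predictor:", round(float(pip[0]), 3))
    print("largest inclusion probability, redundant:", round(float(pip[1:].max()), 3))
    print("forward stepwise selected columns:", forward_stepwise(X, y))
```

Repeating this over many seeds and coefficient values, and counting how often each variable is selected, would give selection-probability estimates of the kind the description reports. For the BIC-weighted method a selection rule is also needed, for example an inclusion probability above 0.5, which is likewise an assumption here.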

Similar items