Penalization and shrinkage methods produced unreliable clinical prediction models especially when sample size was small

Bibliographic Details
Main Authors:	Riley, Richard D., Snell, Kym I.E., Martin, Glen P., Whittle, Rebecca, Archer, Lucinda, Sperrin, Matthew, Collins, Gary S.
Format:	Online Article Text
Language:	English
Published:	Elsevier 2021
Subjects:	Original Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8026952/ https://www.ncbi.nlm.nih.gov/pubmed/33307188 http://dx.doi.org/10.1016/j.jclinepi.2020.12.005

collection	PubMed
description	OBJECTIVES: When developing a clinical prediction model, penalization techniques are recommended to address overfitting, as they shrink predictor effect estimates toward the null and reduce mean-square prediction error in new individuals. However, shrinkage and penalty terms (‘tuning parameters’) are estimated with uncertainty from the development data set. We examined the magnitude of this uncertainty and the subsequent impact on prediction model performance. STUDY DESIGN AND SETTING: This study comprises applied examples and a simulation study of the following methods: uniform shrinkage (estimated via a closed-form solution or bootstrapping), ridge regression, the lasso, and elastic net. RESULTS: In a particular model development data set, penalization methods can be unreliable because tuning parameters are estimated with large uncertainty. This is of most concern when development data sets have a small effective sample size and the model's Cox-Snell R² is low. The problem can lead to considerable miscalibration of model predictions in new individuals. CONCLUSION: Penalization methods are not a ‘carte blanche’; they do not guarantee a reliable prediction model is developed. They are more unreliable when needed most (i.e., when overfitting may be large). We recommend they are best applied with large effective sample sizes, as identified from recent sample size calculations that aim to minimize the potential for model overfitting and precisely estimate key parameters.
format	Online Article Text
id	pubmed-8026952
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Elsevier
record_format	MEDLINE/PubMed
spelling	pubmed-8026952 2021-04-13 J Clin Epidemiol Original Article Elsevier 2021-04 /pmc/articles/PMC8026952/ /pubmed/33307188 http://dx.doi.org/10.1016/j.jclinepi.2020.12.005 Text en © 2021 The Authors. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
topic	Original Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8026952/ https://www.ncbi.nlm.nih.gov/pubmed/33307188 http://dx.doi.org/10.1016/j.jclinepi.2020.12.005
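
The abstract's concern about small effective sample size and low Cox-Snell R² can be illustrated with a rough sketch. It assumes the commonly used closed-form (heuristic) uniform shrinkage factor S = 1 - p / LR, with the likelihood ratio statistic approximated as LR = -n ln(1 - R²). This formula is not stated in the record itself, and the function name, the choice of 10 predictor parameters, and the R² value of 0.1 are hypothetical. The sketch does not reproduce the study's simulation; it only shows how the expected shrinkage changes with sample size.

```python
import math


def heuristic_shrinkage(n: int, p: int, r2_cs: float) -> float:
    """Closed-form (heuristic) uniform shrinkage factor, S = 1 - p / LR.

    Assumption (not from the catalog record): the likelihood ratio
    statistic is approximated from the Cox-Snell R-squared as
    LR = -n * ln(1 - R2_CS).
    """
    if n <= 0 or p <= 0:
        raise ValueError("n and p must be positive")
    if not 0.0 < r2_cs < 1.0:
        raise ValueError("r2_cs must lie strictly between 0 and 1")
    lr = -n * math.log(1.0 - r2_cs)  # LR statistic implied by n and R2_CS
    return 1.0 - p / lr


if __name__ == "__main__":
    # Hypothetical scenario: 10 predictor parameters, Cox-Snell R2 of 0.1.
    for n in (100, 250, 1000, 5000):
        s = heuristic_shrinkage(n, p=10, r2_cs=0.1)
        print(f"n={n:5d}  shrinkage factor={s:.3f}")
```

A factor near 1 means little shrinkage is needed; a factor near 0 means the predictor effects would be shrunk almost entirely toward the null. Under these assumptions the factor drops sharply as n falls, which is the setting where the abstract reports that tuning parameters are estimated with the most uncertainty.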