Improving prediction models with new markers: a comparison of updating strategies

BACKGROUND: New markers hold the promise of improving risk prediction for individual patients. We aimed to compare the performance of different strategies to extend a previously developed prediction model with a new marker. METHODS: Our motivating example was the extension of a risk calculator for p...

Bibliographic Details
Main authors:	Nieboer, D., Vergouwe, Y., Ankerst, Danna P., Roobol, Monique J., Steyerberg, Ewout W.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5039804/ https://www.ncbi.nlm.nih.gov/pubmed/27678479 http://dx.doi.org/10.1186/s12874-016-0231-2

author	Nieboer, D. Vergouwe, Y. Ankerst, Danna P. Roobol, Monique J. Steyerberg, Ewout W.
collection	PubMed
description	BACKGROUND: New markers hold the promise of improving risk prediction for individual patients. We aimed to compare the performance of different strategies to extend a previously developed prediction model with a new marker. METHODS: Our motivating example was the extension of a risk calculator for prostate cancer with a new marker that was available in a relatively small dataset. Performance of the strategies was also investigated in simulations. Development, marker and test sets with different sample sizes, all originating from the same underlying population, were generated. A prediction model was fitted using logistic regression in the development set, extended using the marker set and validated in the test set. The extension strategies considered were re-estimation of individual regression coefficients, updating of predictions using conditional likelihood ratios (LR), and imputation of marker values in the development set followed by fitting a model in the combined development and marker sets. Sample sizes considered for the development and marker sets were 500 and 100, 500 and 500, and 100 and 500 patients. Discriminative ability of the extended models was quantified using the concordance statistic (c-statistic) and calibration was quantified using the calibration slope. RESULTS: All strategies led to extended models with increased discrimination (c-statistic increase from 0.75 to 0.80 in test sets). Strategies estimating a large number of parameters (re-estimation of all coefficients and updating using conditional LR) led to overfitting (calibration slope below 1). Parsimonious methods, which limit the number of coefficients to be re-estimated or apply shrinkage after model revision, limited the amount of overfitting. Combining the development and marker sets using imputation of missing marker values led to consistently well-performing models in all scenarios. Similar results were observed in the motivating example. CONCLUSION: When the sample with the new marker information is small, parsimonious methods are required to prevent overfitting of a new prediction model. Combining all data with imputation of missing marker values is an attractive option, even if a relatively large marker data set is available. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12874-016-0231-2) contains supplementary material, which is available to authorized users.
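The abstract names two performance measures (c-statistic and calibration slope) and one updating rule (likelihood ratios) without showing how they are computed. The following is a minimal sketch in plain Python, not the authors' code. The function names, the Newton-Raphson fitting routine and the toy numbers are assumptions made for illustration. In particular, `update_with_lr` only shows the odds form of Bayes' theorem with a likelihood ratio passed in as a number, whereas the conditional LR in the study is estimated from data.

```python
import math


def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the
    higher predicted risk; tied predictions count as one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5
    return score / (len(events) * len(nonevents))


def calibration_slope(y, p, iterations=25):
    """Slope b of the logistic model logit(P(y = 1)) = a + b * logit(p),
    fitted by Newton-Raphson. A slope below 1 means the predictions are
    too extreme, which is the usual sign of overfitting."""
    lp = [math.log(pi / (1 - pi)) for pi in p]  # linear predictor
    a, b = 0.0, 1.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, x in zip(y, lp):
            mu = 1 / (1 + math.exp(-(a + b * x)))
            w = mu * (1 - mu)
            g0 += yi - mu           # score for the intercept
            g1 += (yi - mu) * x     # score for the slope
            h00 += w                # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b


def update_with_lr(p, lr):
    """Odds form of Bayes' theorem: posterior odds = prior odds * LR.
    Here lr is a hypothetical, already-estimated likelihood ratio."""
    odds = p / (1 - p) * lr
    return odds / (1 + odds)


# Toy data (invented): 10 patients predicted at 0.1 with 2 events,
# 10 patients predicted at 0.9 with 8 events.
y_toy = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2
p_toy = [0.1] * 10 + [0.9] * 10
slope = calibration_slope(y_toy, p_toy)  # below 1: predictions too extreme
```

In the toy data the observed event rates (0.2 and 0.8) are less extreme than the predicted risks (0.1 and 0.9), so the fitted slope falls below 1, matching the abstract's description of overfitting.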
format	Online Article Text
id	pubmed-5039804
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5039804 2016-10-05 Improving prediction models with new markers: a comparison of updating strategies Nieboer, D. Vergouwe, Y. Ankerst, Danna P. Roobol, Monique J. Steyerberg, Ewout W. BMC Med Res Methodol Research Article BioMed Central 2016-09-27 /pmc/articles/PMC5039804/ /pubmed/27678479 http://dx.doi.org/10.1186/s12874-016-0231-2 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Improving prediction models with new markers: a comparison of updating strategies
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5039804/ https://www.ncbi.nlm.nih.gov/pubmed/27678479 http://dx.doi.org/10.1186/s12874-016-0231-2