Review and evaluation of performance measures for survival prediction models in external validation settings

BACKGROUND: When developing a prediction model for survival data it is essential to validate its performance in external validation settings using appropriate performance measures. Although a number of such measures have been proposed, there is only limited guidance regarding their use in the contex...

Bibliographic Details
Main authors:	Rahman, M. Shafiqur, Ambler, Gareth, Choodari-Oskooei, Babak, Omar, Rumana Z.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395888/ https://www.ncbi.nlm.nih.gov/pubmed/28420338 http://dx.doi.org/10.1186/s12874-017-0336-2

author	Rahman, M. Shafiqur Ambler, Gareth Choodari-Oskooei, Babak Omar, Rumana Z.
collection	PubMed
description	BACKGROUND: When developing a prediction model for survival data it is essential to validate its performance in external validation settings using appropriate performance measures. Although a number of such measures have been proposed, there is only limited guidance regarding their use in the context of model validation. This paper reviewed and evaluated a wide range of performance measures to provide some guidelines for their use in practice. METHODS: An extensive simulation study based on two clinical datasets was conducted to investigate the performance of the measures in external validation settings. Measures were selected from categories that assess the overall performance, discrimination and calibration of a survival prediction model. Some of these have been modified to allow their use with validation data, and a case study is provided to describe how these measures can be estimated in practice. The measures were evaluated with respect to their robustness to censoring and ease of interpretation. All measures are implemented, or are straightforward to implement, in statistical software. RESULTS: Most of the performance measures were reasonably robust to moderate levels of censoring. One exception was Harrell’s concordance measure which tended to increase as censoring increased. CONCLUSIONS: We recommend that Uno’s concordance measure is used to quantify concordance when there are moderate levels of censoring. Alternatively, Gönen and Heller’s measure could be considered, especially if censoring is very high, but we suggest that the prediction model is re-calibrated first. We also recommend that Royston’s D is routinely reported to assess discrimination since it has an appealing interpretation. The calibration slope is useful for both internal and external validation settings and recommended to report routinely. Our recommendation would be to use any of the predictive accuracy measures and provide the corresponding predictive accuracy curves. In addition, we recommend to investigate the characteristics of the validation data such as the level of censoring and the distribution of the prognostic index derived in the validation setting before choosing the performance measures.
format	Online Article Text
id	pubmed-5395888
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5395888 2017-04-20 Review and evaluation of performance measures for survival prediction models in external validation settings Rahman, M. Shafiqur Ambler, Gareth Choodari-Oskooei, Babak Omar, Rumana Z. BMC Med Res Methodol Research Article BACKGROUND: When developing a prediction model for survival data it is essential to validate its performance in external validation settings using appropriate performance measures. Although a number of such measures have been proposed, there is only limited guidance regarding their use in the context of model validation. This paper reviewed and evaluated a wide range of performance measures to provide some guidelines for their use in practice. METHODS: An extensive simulation study based on two clinical datasets was conducted to investigate the performance of the measures in external validation settings. Measures were selected from categories that assess the overall performance, discrimination and calibration of a survival prediction model. Some of these have been modified to allow their use with validation data, and a case study is provided to describe how these measures can be estimated in practice. The measures were evaluated with respect to their robustness to censoring and ease of interpretation. All measures are implemented, or are straightforward to implement, in statistical software. RESULTS: Most of the performance measures were reasonably robust to moderate levels of censoring. One exception was Harrell’s concordance measure which tended to increase as censoring increased. CONCLUSIONS: We recommend that Uno’s concordance measure is used to quantify concordance when there are moderate levels of censoring. Alternatively, Gönen and Heller’s measure could be considered, especially if censoring is very high, but we suggest that the prediction model is re-calibrated first. We also recommend that Royston’s D is routinely reported to assess discrimination since it has an appealing interpretation. The calibration slope is useful for both internal and external validation settings and recommended to report routinely. Our recommendation would be to use any of the predictive accuracy measures and provide the corresponding predictive accuracy curves. In addition, we recommend to investigate the characteristics of the validation data such as the level of censoring and the distribution of the prognostic index derived in the validation setting before choosing the performance measures. BioMed Central 2017-04-18 /pmc/articles/PMC5395888/ /pubmed/28420338 http://dx.doi.org/10.1186/s12874-017-0336-2 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Review and evaluation of performance measures for survival prediction models in external validation settings
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395888/ https://www.ncbi.nlm.nih.gov/pubmed/28420338 http://dx.doi.org/10.1186/s12874-017-0336-2