
Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data

BACKGROUND: Prognostic studies of time-to-event data, where researchers aim to develop or validate multivariable prognostic models in order to predict survival, are commonly seen in the medical literature; however, most are performed retrospectively and few consider sample size prior to analysis. Ev...


Bibliographic Details
Main authors: Jinks, Rachel C., Royston, Patrick, Parmar, Mahesh KB
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4603804/
https://www.ncbi.nlm.nih.gov/pubmed/26459415
http://dx.doi.org/10.1186/s12874-015-0078-y
author Jinks, Rachel C.
Royston, Patrick
Parmar, Mahesh KB
collection PubMed
description BACKGROUND: Prognostic studies of time-to-event data, where researchers aim to develop or validate multivariable prognostic models in order to predict survival, are commonly seen in the medical literature; however, most are performed retrospectively and few consider sample size prior to analysis. Events per variable rules are sometimes cited, but these are based on bias and coverage of confidence intervals for model terms, which are not of primary interest when developing a model to predict outcome. In this paper we aim to develop sample size recommendations for multivariable models of time-to-event data, based on their prognostic ability.
METHODS: We derive formulae for determining the sample size required for multivariable prognostic models in time-to-event data, based on a measure of discrimination, D, developed by Royston and Sauerbrei. These formulae fall into two categories: either based on the significance of the value of D in a new study compared to a previous estimate, or based on the precision of the estimate of D in a new study in terms of confidence interval width. Using simulation we show that they give the desired power and type I error and are not affected by random censoring. Additionally, we conduct a literature review to collate published values of D in different disease areas.
RESULTS: We illustrate our methods using parameters from a published prognostic study in liver cancer. The resulting sample sizes can be large, and we suggest controlling study size by expressing the desired accuracy in the new study as a relative value as well as an absolute value. To improve usability we use the values of D obtained from the literature review to develop an equation to approximately convert the commonly reported Harrell’s c-index to D. A flow chart is provided to aid decision making when using these methods.
CONCLUSION: We have developed a suite of sample size calculations based on the prognostic ability of a survival model, rather than the magnitude or significance of model coefficients. We have taken care to develop the practical utility of the calculations and give recommendations for their use in contemporary clinical research.
format Online
Article
Text
id pubmed-4603804
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4603804 2015-10-14 Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data Jinks, Rachel C. Royston, Patrick Parmar, Mahesh KB BMC Med Res Methodol Research Article BioMed Central 2015-10-12 /pmc/articles/PMC4603804/ /pubmed/26459415 http://dx.doi.org/10.1186/s12874-015-0078-y Text en © Jinks et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data
topic Research Article