
Overestimation of benefit when clinical trials stop early: a simulation study


Bibliographic Details
Main Authors: Liu, Sharon; Garrison, Scott R.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9446780/
https://www.ncbi.nlm.nih.gov/pubmed/36064448
http://dx.doi.org/10.1186/s13063-022-06689-9
Description
Summary: BACKGROUND: Stopping trials early because of a favourable interim analysis can exaggerate benefit. This study simulated trials typical of those stopping early for benefit in the real world and estimated the degree to which early stopping likely overestimates benefit. METHODS: From 1 million simulated trials, we selected those trials that exceeded interim stopping criteria, and compared apparent benefit when stopped with the true benefit used to generate the data. Each simulation randomly assigned period of observation, number of subjects, and control event rate using normal distributions centred on the same parameters in a template trial typical of real-world “truncated” (i.e. stopped for benefit) trials. The intervention’s true relative risk reduction (RRR) was also randomized, and assumed 1% of drugs have a warfarin-like effect (60% RRR), 5% a statin-like effect (35% RRR), 39% an ASA-like effect (15% RRR), 50% no effect (0% RRR), and that 5% would cause harm (modelled as a 20% relative risk increase). Trials had a single interim analysis and a z-value for stopping of 2.782 (O’Brien-Fleming threshold). We also modelled (1) a large truncated trial based on the SPRINT blood pressure trial (using SPRINT’s parameters and stopping criteria) and (2) the same typical truncated trials if they instead went to completion as planned with no interim analysis. RESULTS: For typical truncated trials, the true RRR was roughly 2/3 the observed RRR at the time of stopping. RRR was overestimated by an absolute 14.9% (median, IQR 6.4–24.6) in typical truncated trials, by 5.3% (IQR −0.1 to 11.4) in the same trials if instead carried to completion, and by 2.3% (IQR 0.98–1.09) in large SPRINT-like trials. For all models, to keep the absolute RRR overestimate below 5%, 250 events were required. CONCLUSION: Simulated trials typical of those stopping early for benefit overestimate the true relative risk reduction by roughly 50% (i.e. the true RRR was 2/3 of the observed value). Overestimation was much smaller, and likely unimportant, when simulating large SPRINT-like trials stopping early. Whether trials were large or small, stopped early or not, a minimum of 250 events was needed to avoid overestimating relative risk reduction by an absolute 5% or more. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13063-022-06689-9.
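The selection effect described in the summary can be illustrated with a minimal Python sketch. It is not the authors' code. The effect mixture and the stopping z-value of 2.782 are taken from the summary; the template values (subjects per arm at the interim look, control event risk, and their standard deviations) are hypothetical placeholders, because the record does not state them, and the randomized observation period is omitted for simplicity. The z-statistic used here is an ordinary pooled two-proportion test, which is also an assumption.

```python
import math
import random
import statistics

# True-effect mixture from the summary: (share of drugs, true RRR).
# A negative RRR encodes the 20% relative risk increase (harm).
EFFECT_MIX = [(0.01, 0.60), (0.05, 0.35), (0.39, 0.15), (0.50, 0.00), (0.05, -0.20)]
Z_STOP = 2.782  # interim stopping threshold quoted in the summary

# Hypothetical template values (NOT from the paper).
MEAN_N_INTERIM = 500      # subjects per arm at the interim look
SD_N_INTERIM = 100
MEAN_CONTROL_RATE = 0.10  # control-arm event risk by the interim look
SD_CONTROL_RATE = 0.02


def count_events(rng, n, risk):
    """Draw a binomial(n, risk) event count as a sum of Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < risk)


def simulate_stopped_trials(n_trials, seed=0):
    """Return (true_rrr, observed_rrr, z) for each trial that crosses Z_STOP."""
    rng = random.Random(seed)
    shares = [s for s, _ in EFFECT_MIX]
    effects = [e for _, e in EFFECT_MIX]
    stopped = []
    for _ in range(n_trials):
        true_rrr = rng.choices(effects, weights=shares, k=1)[0]
        n = max(50, round(rng.gauss(MEAN_N_INTERIM, SD_N_INTERIM)))
        p_control = min(0.5, max(0.01, rng.gauss(MEAN_CONTROL_RATE, SD_CONTROL_RATE)))
        p_treated = p_control * (1.0 - true_rrr)
        events_c = count_events(rng, n, p_control)
        events_t = count_events(rng, n, p_treated)
        if events_c == 0:
            continue  # observed RRR is undefined without control events
        # Pooled two-proportion z-test; positive z favours the intervention.
        pooled = (events_c + events_t) / (2 * n)
        se = math.sqrt(2 * pooled * (1 - pooled) / n)
        z = (events_c - events_t) / n / se
        if z >= Z_STOP:
            stopped.append((true_rrr, 1.0 - events_t / events_c, z))
    return stopped


def median_overestimate(stopped):
    """Median absolute RRR overestimate (observed minus true) among stopped trials."""
    return statistics.median(obs - true for true, obs, _ in stopped)


if __name__ == "__main__":
    trials = simulate_stopped_trials(5000, seed=1)
    print(len(trials), "trials stopped early")
    print("median absolute RRR overestimate:", round(median_overestimate(trials), 3))
```

Because only trials whose interim z-value crosses the threshold are kept, the retained observed RRRs are biased upward relative to the true values that generated them. The numbers this sketch prints depend on the placeholder template values and should not be compared directly with the published results.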