
False-positive and false-negative risks for individual multicentre trials in critical care


Bibliographic Details
Main authors: Sidebotham, David; Barlow, C. Jake
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10430847/
https://www.ncbi.nlm.nih.gov/pubmed/37588693
http://dx.doi.org/10.1016/j.bjao.2022.100003
Description
Summary: BACKGROUND: In medical research, null hypothesis significance testing (NHST) is the dominant framework for statistical inference. NHST involves calculating P-values and confidence intervals to quantify the evidence against the null hypothesis of no effect. However, P-values and confidence intervals cannot tell us the probability that the hypothesis is true. In contrast, false-positive risk (FPR) and false-negative risk (FNR) are post-test probabilities concerning the truth of the hypothesis, that is to say, the probability that a real effect exists. METHODS: We calculated the FPR or FNR for 53 individual multicentre trials in critical care based on a pretest probability of 0.5 that the hypothesis was true. RESULTS: For trials reporting statistical significance, the FPR varied between 0.1% and 57.6%. For trials reporting non-significance, the FNR varied between 1.7% and 36.9%. Twenty-six of 47 trials (55.3%) reporting non-significance provided strong or very strong evidence in favour of the null hypothesis; the remaining trials provided limited evidence. There was no obvious relationship between the P-value and the FNR. CONCLUSIONS: The FPR and FNR showed marked variability, indicating that the probability of a real or absent treatment effect differed substantially between trials. Only one trial reporting statistical significance provided convincing evidence of a real treatment effect, and nearly half of all trials reporting non-significance provided limited evidence for the absence of a treatment effect. Our findings suggest that the quality of evidence from multicentre trials in critical care is highly variable.
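
Illustrative note: the abstract does not state the formula used to obtain the FPR and FNR, so the following is only a minimal sketch of one common textbook formulation, not the authors' method. It applies Bayes' theorem to a test with significance level `alpha` and statistical power `power`, given a pretest probability `prior` that a real effect exists. The function names and the values `alpha = 0.05` and `power = 0.8` are assumptions chosen for illustration; only the pretest probability of 0.5 comes from the abstract.

```python
def false_positive_risk(prior: float, alpha: float, power: float) -> float:
    """P(no real effect | significant result), via Bayes' theorem.

    prior: pretest probability that a real effect exists
    alpha: significance level (assumed value, not from the abstract)
    power: probability of a significant result when the effect is real
    """
    false_pos = alpha * (1.0 - prior)   # significant although no effect exists
    true_pos = power * prior            # significant and the effect is real
    return false_pos / (false_pos + true_pos)


def false_negative_risk(prior: float, alpha: float, power: float) -> float:
    """P(real effect | non-significant result), via Bayes' theorem."""
    false_neg = (1.0 - power) * prior           # non-significant although the effect is real
    true_neg = (1.0 - alpha) * (1.0 - prior)    # non-significant and no effect exists
    return false_neg / (false_neg + true_neg)


if __name__ == "__main__":
    # Pretest probability 0.5 as in the abstract; alpha and power are illustrative.
    fpr = false_positive_risk(prior=0.5, alpha=0.05, power=0.8)
    fnr = false_negative_risk(prior=0.5, alpha=0.05, power=0.8)
    print(f"FPR = {fpr:.1%}")  # FPR = 5.9%
    print(f"FNR = {fnr:.1%}")  # FNR = 17.4%
```

Under these assumed inputs, a significant result still leaves roughly a 6% chance that no real effect exists, and a non-significant result leaves roughly a 17% chance that one does. This simple version conditions only on whether the result crossed the significance threshold, so it will not reproduce the trial-specific values reported in the abstract.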