
Interpreting observational studies: why empirical calibration is needed to correct p-values

Often the literature makes assertions of medical product effects on the basis of 'p < 0.05'. The underlying premise is that at this threshold, there is only a 5% probability that the observed effect would be seen by chance when in reality there is no effect. In observational studies, much more than in randomized trials, bias and confounding may undermine this premise. To test this premise, we selected three exemplar drug safety studies from the literature, representing a case–control, a cohort, and a self-controlled case series design. We attempted to replicate these studies as best we could for the drugs studied in the original articles. Next, we applied the same three designs to sets of negative controls: drugs that are not believed to cause the outcome of interest. We observed how often p < 0.05 when the null hypothesis is true, and we fitted distributions to the effect estimates. Using these distributions, we computed calibrated p-values that reflect the probability of observing the effect estimate under the null hypothesis, taking both random and systematic error into account. An automated analysis of scientific literature was performed to evaluate the potential impact of such a calibration. Our experiment provides evidence that the majority of observational studies would declare statistical significance when no effect is present. Empirical calibration was found to reduce spurious results to the desired 5% level. Applying these adjustments to literature suggests that at least 54% of findings with p < 0.05 are not actually statistically significant and should be reevaluated. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
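The calibration procedure the abstract describes — fitting a distribution to negative-control effect estimates and using it as an empirical null — can be sketched in a few lines. This is a minimal illustration, not the authors' full method: the names are illustrative, estimates are assumed to be on the log scale, and the paper's approach additionally incorporates each estimate's standard error when fitting the null, which this sketch omits.

```python
import math

def normal_sf(z):
    """Survival function of the standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def calibrate_p_value(log_estimate, null_log_estimates):
    """Calibrated p-value: fit a normal 'empirical null' to effect
    estimates obtained for negative controls (drugs believed to have
    no effect on the outcome), then return the two-sided tail
    probability of the observed estimate under that fitted null.
    Systematic error shifts and widens the null, so calibrated
    p-values are typically larger than traditional ones."""
    n = len(null_log_estimates)
    mu = sum(null_log_estimates) / n
    var = sum((x - mu) ** 2 for x in null_log_estimates) / (n - 1)
    z = (log_estimate - mu) / math.sqrt(var)
    return 2.0 * normal_sf(abs(z))

# Hypothetical negative-control estimates centered above zero, indicating
# systematic bias: a log relative risk of log(1.5) that a traditional test
# might flag as significant can lose significance under the empirical null.
null_log_rr = [0.1, 0.3, 0.2, 0.0, 0.4, 0.15, 0.25, 0.35, 0.05, 0.2]
p_cal = calibrate_p_value(math.log(1.5), null_log_rr)
```

When the negative-control estimates are unbiased and centered at zero, the calibrated p-value reduces to roughly the traditional one; the further the empirical null drifts from zero, the more the calibration discounts nominally significant findings.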


Bibliographic Details
Main Authors: Schuemie, Martijn J; Ryan, Patrick B; DuMouchel, William; Suchard, Marc A; Madigan, David
Format: Online Article Text
Language: English
Published: Blackwell Publishing Ltd, 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4285234/
https://www.ncbi.nlm.nih.gov/pubmed/23900808
http://dx.doi.org/10.1002/sim.5925
Journal: Stat Med (Research Articles)
Collection: PubMed, record pubmed-4285234 (National Center for Biotechnology Information)
Record format: MEDLINE/PubMed
Published online: 2013-07-30; issue date: 2014-01-30
License: © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.