On the Q statistic with constant weights in meta-analysis of binary outcomes

Bibliographic Details
Main authors: Kulinskaya, Elena; Hoaglin, David C.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10286409/
https://www.ncbi.nlm.nih.gov/pubmed/37344771
http://dx.doi.org/10.1186/s12874-023-01939-z
Description
Summary:

BACKGROUND: Cochran’s Q statistic is routinely used for testing heterogeneity in meta-analysis. Its expected value (under an incorrect null distribution) is part of several popular estimators of the between-study variance, $\tau^2$. Those applications generally do not account for use of the studies’ estimated variances in the inverse-variance weights that define Q (more explicitly, $Q_{IV}$). Importantly, those weights make approximating the distribution of $Q_{IV}$ rather complicated.

METHODS: As an alternative, we investigate a Q statistic, $Q_F$, whose constant weights use only the studies’ arm-level sample sizes. For log-odds-ratio (LOR), log-relative-risk (LRR), and risk difference (RD) as the measures of effect, we study, by simulation, approximations to the distributions of $Q_F$ and $Q_{IV}$ as the basis for tests of heterogeneity.

RESULTS: For LOR and LRR, a two-moment gamma approximation to the distribution of $Q_F$ works well for small sample sizes, and an approximation based on an algorithm of Farebrother is recommended for larger sample sizes. For RD, the Farebrother approximation works very well, even for small sample sizes. For $Q_{IV}$, the standard chi-square approximation provides levels that are much too low for LOR and LRR and too high for RD. The Kulinskaya et al. (Res Synth Methods 2:254–70, 2011) approximation for RD and the Kulinskaya and Dollinger (BMC Med Res Methodol 15:49, 2015) approximation for LOR work well for $Q_{IV}$ but have some convergence issues for very small sample sizes combined with small probabilities.

CONCLUSIONS: The performance of the standard $\chi^2$ approximation is inadequate for all three binary effect measures. Instead, we recommend a test of heterogeneity based on $Q_F$ and provide practical guidelines for choosing an appropriate test at the .05 level for all three effect measures.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-023-01939-z.
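
To make the contrast between the two statistics concrete, the following is a minimal Python sketch, not the authors' implementation. It computes a generic weighted Q for log-odds-ratios twice: once with inverse-variance weights (which depend on the estimated variances) and once with constant weights built only from arm-level sample sizes. Several details are assumptions for illustration and are not stated in the summary above: the effective-sample-size form of the constant weight, n_T * n_C / (n_T + n_C); the 0.5 continuity correction; the helper names; and the made-up study counts. The helper `gamma_from_moments` shows only the moment-matching step of a two-moment gamma approximation; the moments of the constant-weights Q themselves would have to come from the paper.

```python
import math

def log_odds_ratio(x_t, n_t, x_c, n_c, cc=0.5):
    """Return (LOR, estimated variance) for one study.

    Adds cc to every cell of the 2x2 table (an assumed continuity correction).
    """
    a, b = x_t + cc, n_t - x_t + cc   # treatment arm: events, non-events
    c, d = x_c + cc, n_c - x_c + cc   # control arm: events, non-events
    lor = math.log(a * d / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d
    return lor, var

def q_statistic(effects, weights):
    """Weighted sum of squared deviations from the weighted mean effect."""
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, effects)) / total
    return sum(w * (y - mean) ** 2 for w, y in zip(weights, effects))

def gamma_from_moments(mean, var):
    """Shape and scale of the gamma distribution with the given mean and variance."""
    return mean ** 2 / var, var / mean

# Hypothetical data: (events_T, n_T, events_C, n_C) for three studies.
studies = [(12, 50, 8, 50), (20, 100, 15, 100), (5, 30, 9, 30)]

effects, variances = zip(*(log_odds_ratio(*s) for s in studies))

# Inverse-variance weights use the estimated variances.
w_iv = [1 / v for v in variances]
# Constant weights use only arm-level sample sizes (assumed effective-sample-size form).
w_f = [n_t * n_c / (n_t + n_c) for _, n_t, _, n_c in studies]

q_iv = q_statistic(effects, w_iv)
q_f = q_statistic(effects, w_f)
print(f"Q with inverse-variance weights: {q_iv:.4f}")
print(f"Q with constant weights:         {q_f:.4f}")
```

Because the constant weights do not depend on the observed event counts, they are not random, which is what makes the distribution of the second statistic easier to approximate than that of the first.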