
Imputation strategies for missing binary outcomes in cluster randomized trials

BACKGROUND: Attrition, which leads to missing data, is a common problem in cluster randomized trials (CRTs), where groups of patients rather than individuals are randomized. Standard multiple imputation (MI) strategies may not be appropriate to impute missing data from CRTs since they assume indepen...


Bibliographic Details
Main authors: Ma, Jinhui, Akhtar-Danesh, Noori, Dolovich, Lisa, Thabane, Lehana
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3055218/
https://www.ncbi.nlm.nih.gov/pubmed/21324148
http://dx.doi.org/10.1186/1471-2288-11-18
_version_ 1782200122777534464
author Ma, Jinhui
Akhtar-Danesh, Noori
Dolovich, Lisa
Thabane, Lehana
author_sort Ma, Jinhui
collection PubMed
description BACKGROUND: Attrition, which leads to missing data, is a common problem in cluster randomized trials (CRTs), where groups of patients rather than individuals are randomized. Standard multiple imputation (MI) strategies may not be appropriate for imputing missing data from CRTs because they assume independent data. In this paper, under the assumptions of missing completely at random and covariate-dependent missingness, we used a simulation study to compare six MI strategies that account for the intra-cluster correlation for missing binary outcomes in CRTs with standard imputation strategies and the complete case analysis approach. METHOD: We considered three within-cluster and three across-cluster MI strategies for missing binary outcomes in CRTs. The three within-cluster MI strategies are the logistic regression method, the propensity score method, and the Markov chain Monte Carlo (MCMC) method, each of which applies a standard MI strategy within each cluster. The three across-cluster MI strategies are the propensity score method, the random-effects (RE) logistic regression approach, and logistic regression with cluster as a fixed effect. Based on the Community Hypertension Assessment Trial (CHAT), which has complete data, we designed a simulation study to investigate the performance of the above MI strategies. RESULTS: The estimated treatment effect and its 95% confidence interval (CI) from the generalized estimating equations (GEE) model based on the complete CHAT dataset are 1.14 (0.76, 1.70). When 30% of the binary outcomes are missing completely at random, the simulation study shows that the estimated treatment effects and corresponding 95% CIs from the GEE model are 1.15 (0.76, 1.75) if complete case analysis is used, 1.12 (0.72, 1.73) if the within-cluster MCMC method is used, 1.21 (0.80, 1.81) if across-cluster RE logistic regression is used, and 1.16 (0.82, 1.64) if standard logistic regression, which does not account for clustering, is used.
CONCLUSION: When the percentage of missing data is low or the intra-cluster correlation coefficient is small, different approaches for handling missing binary outcome data generate quite similar results. When the percentage of missing data is large, standard MI strategies, which do not take the intra-cluster correlation into account, underestimate the variance of the treatment effect. Within-cluster and across-cluster MI strategies (except for the random-effects logistic regression MI strategy), which take the intra-cluster correlation into account, seem more appropriate for handling missing outcomes from CRTs. Under the same imputation strategy and percentage of missingness, the estimates of the treatment effect from the GEE and RE logistic regression models are similar.
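The within-cluster idea described in the abstract can be illustrated with a small, simplified sketch. It is not the authors' procedure: it assumes a covariate-free model in which each cluster's missing outcomes are drawn from a Beta posterior on that cluster's observed event rate (uniform prior), it replaces the GEE analysis with a plain cluster-level difference in proportions, and it pools the imputed analyses with Rubin's rules. All function names, parameter values, and the simulated data are hypothetical and are not taken from CHAT.

```python
import math
import random
import statistics


def simulate_crt(n_clusters=20, cluster_size=50, effect=0.13, cluster_sd=0.3, seed=0):
    """Simulate binary outcomes for a two-arm CRT (all values are hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for c in range(n_clusters):
        arm = c % 2  # alternate clusters between control (0) and intervention (1)
        u = rng.gauss(0.0, cluster_sd)  # shared cluster effect induces intra-cluster correlation
        p = 1.0 / (1.0 + math.exp(-(-0.5 + effect * arm + u)))
        rows.extend({"cluster": c, "arm": arm, "y": int(rng.random() < p)}
                    for _ in range(cluster_size))
    return rows


def make_mcar(rows, frac, seed=1):
    """Return a copy with a fraction `frac` of outcomes set to None, completely at random."""
    rng = random.Random(seed)
    out = [dict(r) for r in rows]
    for i in rng.sample(range(len(out)), int(round(frac * len(out)))):
        out[i]["y"] = None
    return out


def impute_within_cluster(rows, rng):
    """One imputed copy: draw each cluster's event rate from its Beta posterior,
    then fill that cluster's missing outcomes with Bernoulli draws."""
    counts = {}
    for r in rows:
        if r["y"] is not None:
            c = counts.setdefault(r["cluster"], [0, 0])
            c[0] += r["y"]
            c[1] += 1
    draws, out = {}, []
    for r in rows:
        r = dict(r)
        if r["y"] is None:
            k = r["cluster"]
            if k not in draws:
                s, n = counts.get(k, (0, 0))
                draws[k] = rng.betavariate(1 + s, 1 + n - s)
            r["y"] = int(rng.random() < draws[k])
        out.append(r)
    return out


def cluster_level_difference(rows):
    """Difference in mean cluster proportions and its variance (a stand-in for GEE)."""
    totals = {}
    for r in rows:
        t = totals.setdefault(r["cluster"], [r["arm"], 0, 0])
        t[1] += r["y"]
        t[2] += 1
    props = {0: [], 1: []}
    for arm, s, n in totals.values():
        props[arm].append(s / n)
    est = statistics.mean(props[1]) - statistics.mean(props[0])
    var = sum(statistics.variance(props[a]) / len(props[a]) for a in (0, 1))
    return est, var


def rubin_pool(estimates, variances):
    """Rubin's rules: pooled estimate and total variance W + (1 + 1/m) * B."""
    m = len(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates)
    return statistics.mean(estimates), within + (1 + 1 / m) * between


if __name__ == "__main__":
    incomplete = make_mcar(simulate_crt(), 0.30)
    rng = random.Random(2)
    results = [cluster_level_difference(impute_within_cluster(incomplete, rng))
               for _ in range(5)]
    est, total_var = rubin_pool([e for e, _ in results], [v for _, v in results])
    print(f"pooled risk difference: {est:.3f}, standard error: {math.sqrt(total_var):.3f}")
```

Because each cluster is imputed only from its own observed outcomes, the imputed values inherit that cluster's event rate; a standard strategy that ignores clustering would instead impute every cluster from one shared rate, which is the setting in which the abstract reports underestimated variance.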
format Text
id pubmed-3055218
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3055218 2011-03-15 Imputation strategies for missing binary outcomes in cluster randomized trials Ma, Jinhui Akhtar-Danesh, Noori Dolovich, Lisa Thabane, Lehana BMC Med Res Methodol Research Article BACKGROUND: Attrition, which leads to missing data, is a common problem in cluster randomized trials (CRTs), where groups of patients rather than individuals are randomized. Standard multiple imputation (MI) strategies may not be appropriate for imputing missing data from CRTs because they assume independent data. In this paper, under the assumptions of missing completely at random and covariate-dependent missingness, we used a simulation study to compare six MI strategies that account for the intra-cluster correlation for missing binary outcomes in CRTs with standard imputation strategies and the complete case analysis approach. METHOD: We considered three within-cluster and three across-cluster MI strategies for missing binary outcomes in CRTs. The three within-cluster MI strategies are the logistic regression method, the propensity score method, and the Markov chain Monte Carlo (MCMC) method, each of which applies a standard MI strategy within each cluster. The three across-cluster MI strategies are the propensity score method, the random-effects (RE) logistic regression approach, and logistic regression with cluster as a fixed effect. Based on the Community Hypertension Assessment Trial (CHAT), which has complete data, we designed a simulation study to investigate the performance of the above MI strategies. RESULTS: The estimated treatment effect and its 95% confidence interval (CI) from the generalized estimating equations (GEE) model based on the complete CHAT dataset are 1.14 (0.76, 1.70).
When 30% of the binary outcomes are missing completely at random, the simulation study shows that the estimated treatment effects and corresponding 95% CIs from the GEE model are 1.15 (0.76, 1.75) if complete case analysis is used, 1.12 (0.72, 1.73) if the within-cluster MCMC method is used, 1.21 (0.80, 1.81) if across-cluster RE logistic regression is used, and 1.16 (0.82, 1.64) if standard logistic regression, which does not account for clustering, is used. CONCLUSION: When the percentage of missing data is low or the intra-cluster correlation coefficient is small, different approaches for handling missing binary outcome data generate quite similar results. When the percentage of missing data is large, standard MI strategies, which do not take the intra-cluster correlation into account, underestimate the variance of the treatment effect. Within-cluster and across-cluster MI strategies (except for the random-effects logistic regression MI strategy), which take the intra-cluster correlation into account, seem more appropriate for handling missing outcomes from CRTs. Under the same imputation strategy and percentage of missingness, the estimates of the treatment effect from the GEE and RE logistic regression models are similar. BioMed Central 2011-02-16 /pmc/articles/PMC3055218/ /pubmed/21324148 http://dx.doi.org/10.1186/1471-2288-11-18 Text en Copyright ©2011 Ma et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Imputation strategies for missing binary outcomes in cluster randomized trials
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3055218/
https://www.ncbi.nlm.nih.gov/pubmed/21324148
http://dx.doi.org/10.1186/1471-2288-11-18