
Evaluation of multiple imputation approaches for handling missing covariate information in a case-cohort study with a binary outcome

BACKGROUND: In case-cohort studies a random subcohort is selected from the inception cohort and acts as the sample of controls for several outcome investigations. Analysis is conducted using only the cases and the subcohort, with inverse probability weighting (IPW) used to account for the unequal sa...


Bibliographic Details
Main Authors: Middleton, Melissa, Nguyen, Cattram, Moreno-Betancur, Margarita, Carlin, John B., Lee, Katherine J.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8978363/
https://www.ncbi.nlm.nih.gov/pubmed/35369860
http://dx.doi.org/10.1186/s12874-021-01495-4
_version_ 1784680948457013248
author Middleton, Melissa
Nguyen, Cattram
Moreno-Betancur, Margarita
Carlin, John B.
Lee, Katherine J.
author_sort Middleton, Melissa
collection PubMed
description BACKGROUND: In case-cohort studies a random subcohort is selected from the inception cohort and acts as the sample of controls for several outcome investigations. Analysis is conducted using only the cases and the subcohort, with inverse probability weighting (IPW) used to account for the unequal sampling probabilities resulting from the study design. Like all epidemiological studies, case-cohort studies are susceptible to missing data. Multiple imputation (MI) has become increasingly popular for addressing missing data in epidemiological studies. It is currently unclear how best to incorporate the weights from a case-cohort analysis in MI procedures used to address missing covariate data.
METHOD: A simulation study was conducted with missingness in two covariates, motivated by a case study within the Barwon Infant Study. MI methods considered were: using the outcome, a proxy for weights in the simple case-cohort design considered, as a predictor in the imputation model, with and without exposure and covariate interactions; imputing separately within each weight category; and using a weighted imputation model. These methods were compared to a complete case analysis (CCA) within the context of a standard IPW analysis model estimating either the risk or odds ratio. The strength of associations, missing data mechanism, proportion of observations with incomplete covariate data, and subcohort selection probability varied across the simulation scenarios. Methods were also applied to the case study.
RESULTS: There was similar performance in terms of relative bias and precision with all MI methods across the scenarios considered, with expected improvements compared with the CCA. Slight underestimation of the standard error was seen throughout but the nominal level of coverage (95%) was generally achieved. All MI methods showed a similar increase in precision as the subcohort selection probability increased, irrespective of the scenario. A similar pattern of results was seen in the case study.
CONCLUSIONS: How weights were incorporated into the imputation model had minimal effect on the performance of MI; this may be due to case-cohort studies only having two weight categories. In this context, inclusion of the outcome in the imputation model was sufficient to account for the unequal sampling probabilities in the analysis model.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-021-01495-4.
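As a rough illustration of the inverse probability weighting mentioned in the abstract, the sketch below computes a weighted risk ratio for a binary exposure. It assumes a common convention for this design: every case is analysed with weight 1, non-case subcohort members are weighted by the inverse of the subcohort selection probability, and everyone else is excluded. The record does not state the authors' exact weights or software, so the `Participant` class, the function names, and the weighting rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Participant:
    """One member of the full cohort (hypothetical structure, not from the paper)."""
    is_case: bool        # binary outcome
    in_subcohort: bool   # selected into the random subcohort
    exposed: bool        # binary exposure


def ipw_weight(person: Participant, subcohort_prob: float) -> float:
    """Return the assumed analysis weight for one participant.

    Assumption: cases are always sampled (weight 1), non-case subcohort
    members have weight 1 / subcohort_prob, and all others have weight 0
    because they are not part of the case-cohort analysis.
    """
    if not 0.0 < subcohort_prob <= 1.0:
        raise ValueError("subcohort_prob must be in (0, 1]")
    if person.is_case:
        return 1.0
    if person.in_subcohort:
        return 1.0 / subcohort_prob
    return 0.0


def weighted_risk_ratio(cohort: Iterable[Participant], subcohort_prob: float) -> float:
    """Weighted risk in the exposed divided by weighted risk in the unexposed."""
    weighted_cases = {True: 0.0, False: 0.0}   # keyed by exposure status
    weighted_total = {True: 0.0, False: 0.0}
    for person in cohort:
        w = ipw_weight(person, subcohort_prob)
        weighted_total[person.exposed] += w
        if person.is_case:
            weighted_cases[person.exposed] += w
    risk_exposed = weighted_cases[True] / weighted_total[True]
    risk_unexposed = weighted_cases[False] / weighted_total[False]
    return risk_exposed / risk_unexposed
```

Under this convention the weight takes only two non-zero values, one for cases and one for non-case subcohort members, which is consistent with the abstract's remark that the outcome acts as a proxy for the weights in a simple case-cohort design.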
format Online
Article
Text
id pubmed-8978363
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8978363 2022-04-05 Evaluation of multiple imputation approaches for handling missing covariate information in a case-cohort study with a binary outcome Middleton, Melissa Nguyen, Cattram Moreno-Betancur, Margarita Carlin, John B. Lee, Katherine J. BMC Med Res Methodol Research BioMed Central 2022-04-03 /pmc/articles/PMC8978363/ /pubmed/35369860 http://dx.doi.org/10.1186/s12874-021-01495-4 Text en © The Author(s) 2022
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Evaluation of multiple imputation approaches for handling missing covariate information in a case-cohort study with a binary outcome
topic Research