Multiple imputation for handling missing outcome data when estimating the relative risk
BACKGROUND: Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate norm...
Main authors: | Sullivan, Thomas R.; Lee, Katherine J.; Ryan, Philip; Salter, Amy B. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5588607/ https://www.ncbi.nlm.nih.gov/pubmed/28877666 http://dx.doi.org/10.1186/s12874-017-0414-5 |
_version_ | 1783262207410176000 |
---|---|
author | Sullivan, Thomas R. Lee, Katherine J. Ryan, Philip Salter, Amy B. |
author_facet | Sullivan, Thomas R. Lee, Katherine J. Ryan, Philip Salter, Amy B. |
author_sort | Sullivan, Thomas R. |
collection | PubMed |
description | BACKGROUND: Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. METHODS: Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. RESULTS: Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. CONCLUSIONS: Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. However, fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12874-017-0414-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5588607 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5588607 2017-09-14 Multiple imputation for handling missing outcome data when estimating the relative risk Sullivan, Thomas R. Lee, Katherine J. Ryan, Philip Salter, Amy B. BMC Med Res Methodol Research Article BACKGROUND: Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. METHODS: Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. RESULTS: Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. CONCLUSIONS: Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. However, fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12874-017-0414-5) contains supplementary material, which is available to authorized users. BioMed Central 2017-09-06 /pmc/articles/PMC5588607/ /pubmed/28877666 http://dx.doi.org/10.1186/s12874-017-0414-5 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Sullivan, Thomas R. Lee, Katherine J. Ryan, Philip Salter, Amy B. Multiple imputation for handling missing outcome data when estimating the relative risk |
title | Multiple imputation for handling missing outcome data when estimating the relative risk |
title_full | Multiple imputation for handling missing outcome data when estimating the relative risk |
title_fullStr | Multiple imputation for handling missing outcome data when estimating the relative risk |
title_full_unstemmed | Multiple imputation for handling missing outcome data when estimating the relative risk |
title_short | Multiple imputation for handling missing outcome data when estimating the relative risk |
title_sort | multiple imputation for handling missing outcome data when estimating the relative risk |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5588607/ https://www.ncbi.nlm.nih.gov/pubmed/28877666 http://dx.doi.org/10.1186/s12874-017-0414-5 |
work_keys_str_mv | AT sullivanthomasr multipleimputationforhandlingmissingoutcomedatawhenestimatingtherelativerisk AT leekatherinej multipleimputationforhandlingmissingoutcomedatawhenestimatingtherelativerisk AT ryanphilip multipleimputationforhandlingmissingoutcomedatawhenestimatingtherelativerisk AT salteramyb multipleimputationforhandlingmissingoutcomedatawhenestimatingtherelativerisk |
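
The abstract in this record refers to estimating relative risks after multiple imputation but does not say how estimates from the imputed datasets are combined. As a rough illustration only, the sketch below pools log relative risk estimates with Rubin's rules, the standard combining step in multiple imputation. The function names `pool_log_rr` and `rr_with_interval` are hypothetical, the input numbers are made up for illustration, and the interval uses a normal approximation rather than the t reference distribution, which is a simplifying assumption and not a description of the authors' analysis.

```python
import math
from statistics import mean, variance


def pool_log_rr(log_rrs, variances):
    """Pool per-imputation log relative risks with Rubin's rules.

    log_rrs   -- log relative risk estimate from each imputed dataset
    variances -- squared standard error of each estimate
    Returns (pooled log RR, total variance).
    """
    m = len(log_rrs)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(log_rrs)            # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(log_rrs)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


def rr_with_interval(log_rrs, variances, z=1.96):
    """Back-transform the pooled log RR to the relative risk scale.

    Assumption: a normal-approximation interval (z = 1.96) is used here
    for simplicity instead of the t distribution.
    """
    q_bar, total = pool_log_rr(log_rrs, variances)
    se = math.sqrt(total)
    return math.exp(q_bar), math.exp(q_bar - z * se), math.exp(q_bar + z * se)


if __name__ == "__main__":
    # Made-up estimates from five hypothetical imputed datasets.
    estimates = [0.40, 0.35, 0.45, 0.38, 0.42]
    squared_ses = [0.010, 0.012, 0.011, 0.009, 0.010]
    rr, lower, upper = rr_with_interval(estimates, squared_ses)
    print(f"pooled RR = {rr:.3f}, approx. 95% interval ({lower:.3f}, {upper:.3f})")
```

Pooling is done on the log scale because the log relative risk is closer to normally distributed than the relative risk itself; the result is exponentiated only at the end.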