Multiple imputation with missing indicators as proxies for unmeasured variables: simulation study
BACKGROUND: Within routinely collected health data, missing data for an individual might provide useful information in itself. This occurs, for example, in the case of electronic health records, where the presence or absence of data is informative. While the naive use of missing indicators to try to...
Main Authors: | Sperrin, Matthew; Martin, Glen P. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7346454/ https://www.ncbi.nlm.nih.gov/pubmed/32640992 http://dx.doi.org/10.1186/s12874-020-01068-x |
_version_ | 1783556411250180096 |
---|---|
author | Sperrin, Matthew; Martin, Glen P. |
author_facet | Sperrin, Matthew; Martin, Glen P. |
author_sort | Sperrin, Matthew |
collection | PubMed |
description | BACKGROUND: Within routinely collected health data, missing data for an individual might provide useful information in itself. This occurs, for example, in the case of electronic health records, where the presence or absence of data is informative. While the naive use of missing indicators to try to exploit such information can introduce bias, their use in conjunction with multiple imputation may unlock the potential value of missingness to reduce bias in causal effect estimation, particularly in missing not at random scenarios and where missingness might be associated with unmeasured confounders. METHODS: We conducted a simulation study to determine when the use of a missing indicator, combined with multiple imputation, would reduce bias for causal effect estimation, under a range of scenarios including unmeasured variables, missing not at random, and missing at random mechanisms. We used directed acyclic graphs and structural models to elucidate a variety of causal structures of interest. We handled missing data using complete case analysis, and multiple imputation with and without missing indicator terms. RESULTS: We find that multiple imputation combined with a missing indicator gives minimal bias for causal effect estimation in most scenarios. In particular, the approach: 1) does not introduce bias in missing (completely) at random scenarios; 2) reduces bias in missing not at random scenarios where the missing mechanism depends on the missing variable itself; and 3) may reduce or increase bias when unmeasured confounding is present. CONCLUSION: In the presence of missing data, careful use of missing indicators, combined with multiple imputation, can improve causal effect estimation when missingness is informative, and is not detrimental when missingness is at random. |
format | Online Article Text |
id | pubmed-7346454 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7346454 2020-07-14 Multiple imputation with missing indicators as proxies for unmeasured variables: simulation study Sperrin, Matthew; Martin, Glen P. BMC Med Res Methodol Research Article BioMed Central 2020-07-08 /pmc/articles/PMC7346454/ /pubmed/32640992 http://dx.doi.org/10.1186/s12874-020-01068-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Multiple imputation with missing indicators as proxies for unmeasured variables: simulation study |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7346454/ https://www.ncbi.nlm.nih.gov/pubmed/32640992 http://dx.doi.org/10.1186/s12874-020-01068-x |
work_keys_str_mv | AT sperrinmatthew multipleimputationwithmissingindicatorsasproxiesforunmeasuredvariablessimulationstudy AT martinglenp multipleimputationwithmissingindicatorsasproxiesforunmeasuredvariablessimulationstudy |
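
The description above compares three ways of handling a partially missing variable: complete case analysis, multiple imputation, and multiple imputation with a missing indicator term. The following is a minimal sketch of that comparison, not the authors' simulation code. Everything in it is an assumption made for illustration: the data-generating model (a confounder `z` affecting exposure `x` and outcome `y`), the missing completely at random mechanism, the function names `ols` and `simulate`, the simplified ("improper") imputation that ignores parameter uncertainty, and the choice to add the indicator `r` to the analysis model only. It uses only the Python standard library.

```python
import random


def ols(rows, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(a[i][c]))  # partial pivoting
        a[c], a[p] = a[p], a[c]
        for i in range(k):
            if i != c:
                f = a[i][c] / a[c][c]
                a[i] = [u - f * v for u, v in zip(a[i], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=5000, beta=1.0, p_missing=0.3, n_imputations=10, seed=1):
    """Estimate the effect of x on y (true value: beta) three ways."""
    rng = random.Random(seed)
    # Hypothetical structural model: z confounds the x -> y relationship.
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.5 * zi + rng.gauss(0, 1) for zi in z]
    y = [beta * xi + zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]
    # Missing indicator: r = 1 means z is unobserved (MCAR in this sketch).
    r = [1 if rng.random() < p_missing else 0 for _ in range(n)]
    obs = [i for i in range(n) if r[i] == 0]

    # 1) Complete case analysis: y ~ x + z on rows with z observed.
    cc = ols([[1, x[i], z[i]] for i in obs], [y[i] for i in obs])[1]

    # Imputation model z ~ x + y, fitted on rows with z observed.
    g = ols([[1, x[i], y[i]] for i in obs], [z[i] for i in obs])
    resid = [z[i] - (g[0] + g[1] * x[i] + g[2] * y[i]) for i in obs]
    sd = (sum(e * e for e in resid) / (len(obs) - 3)) ** 0.5

    mi, mi_ind = [], []
    for _ in range(n_imputations):
        # Fill each missing z with its prediction plus residual noise.
        zc = [z[i] if r[i] == 0
              else g[0] + g[1] * x[i] + g[2] * y[i] + rng.gauss(0, sd)
              for i in range(n)]
        # 2) MI: y ~ x + z on the completed data.
        mi.append(ols([[1, x[i], zc[i]] for i in range(n)], y)[1])
        # 3) MI + missing indicator: y ~ x + z + r on the completed data.
        mi_ind.append(ols([[1, x[i], zc[i], r[i]] for i in range(n)], y)[1])

    # Pool point estimates by averaging across imputations (Rubin's rules).
    return {
        "complete_case": cc,
        "mi": sum(mi) / len(mi),
        "mi_indicator": sum(mi_ind) / len(mi_ind),
    }


if __name__ == "__main__":
    for method, estimate in simulate().items():
        print(f"{method}: {estimate:.3f}")
```

Under this missing completely at random setup, all three estimates should land close to the true effect `beta`, in line with the description's statement that the indicator does not introduce bias in that scenario. To explore the other scenarios mentioned, one could make `r` depend on `z` itself or on an unmeasured variable; those extensions are left out of this sketch.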