
Multiple imputation of missing covariates with non-linear effects and interactions: an evaluation of statistical methods

BACKGROUND: Multiple imputation is often used for missing data. When a model contains as covariates more than one function of a variable, it is not obvious how best to impute missing values in these covariates. Consider a regression with outcome Y and covariates X and X². In 'passive imputati...


Bibliographic Details
Main authors: Seaman, Shaun R, Bartlett, Jonathan W, White, Ian R
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3403931/
https://www.ncbi.nlm.nih.gov/pubmed/22489953
http://dx.doi.org/10.1186/1471-2288-12-46
_version_ 1782238948753408000
author Seaman, Shaun R
Bartlett, Jonathan W
White, Ian R
author_facet Seaman, Shaun R
Bartlett, Jonathan W
White, Ian R
author_sort Seaman, Shaun R
collection PubMed
description BACKGROUND: Multiple imputation is often used for missing data. When a model contains as covariates more than one function of a variable, it is not obvious how best to impute missing values in these covariates. Consider a regression with outcome Y and covariates X and X². In 'passive imputation' a value X* is imputed for X and then X² is imputed as (X*)². A recent proposal is to treat X² as 'just another variable' (JAV) and impute X and X² under multivariate normality. METHODS: We use simulation to investigate the performance of three methods that can easily be implemented in standard software: 1) linear regression of X on Y to impute X then passive imputation of X²; 2) the same regression but with predictive mean matching (PMM); and 3) JAV. We also investigate the performance of analogous methods when the analysis involves an interaction, and study the theoretical properties of JAV. The application of the methods when complete or incomplete confounders are also present is illustrated using data from the EPIC Study. RESULTS: JAV gives consistent estimation when the analysis is linear regression with a quadratic or interaction term and X is missing completely at random. When X is missing at random, JAV may be biased, but this bias is generally less than for passive imputation and PMM. Coverage for JAV was usually good when bias was small. However, in some scenarios with a more pronounced quadratic effect, bias was large and coverage poor. When the analysis was logistic regression, JAV's performance was sometimes very poor. PMM generally improved on passive imputation, in terms of bias and coverage, but did not eliminate the bias. CONCLUSIONS: Given the current state of available software, JAV is the best of a set of imperfect imputation methods for linear regression with a quadratic or interaction effect, but should not be used for logistic regression.
format Online
Article
Text
id pubmed-3403931
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3403931 2012-07-27 Multiple imputation of missing covariates with non-linear effects and interactions: an evaluation of statistical methods Seaman, Shaun R Bartlett, Jonathan W White, Ian R BMC Med Res Methodol Research Article BioMed Central 2012-04-10 /pmc/articles/PMC3403931/ /pubmed/22489953 http://dx.doi.org/10.1186/1471-2288-12-46 Text en Copyright ©2012 Seaman et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Seaman, Shaun R
Bartlett, Jonathan W
White, Ian R
Multiple imputation of missing covariates with non-linear effects and interactions: an evaluation of statistical methods
title Multiple imputation of missing covariates with non-linear effects and interactions: an evaluation of statistical methods
title_sort multiple imputation of missing covariates with non-linear effects and interactions: an evaluation of statistical methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3403931/
https://www.ncbi.nlm.nih.gov/pubmed/22489953
http://dx.doi.org/10.1186/1471-2288-12-46
work_keys_str_mv AT seamanshaunr multipleimputationofmissingcovariateswithnonlineareffectsandinteractionsanevaluationofstatisticalmethods
AT bartlettjonathanw multipleimputationofmissingcovariateswithnonlineareffectsandinteractionsanevaluationofstatisticalmethods
AT whiteianr multipleimputationofmissingcovariateswithnonlineareffectsandinteractionsanevaluationofstatisticalmethods
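
The abstract contrasts 'passive imputation' (impute X* from a linear regression of X on Y, then set X² = (X*)²) with JAV (impute X and X² jointly under multivariate normality). The following is a minimal Python sketch of that contrast, not the authors' code. The assumptions are ours: a simulated data set with Y = X + X² + error, X missing completely at random for half of the records, five imputations, and imputation parameters held fixed at their complete-case estimates rather than drawn from a posterior, so the sketch only illustrates point estimates of the quadratic coefficient. The function names and `run_demo` are hypothetical.

```python
import numpy as np

def fit_ols(design, response):
    """Least-squares coefficients; `design` must already contain the intercept column."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

def with_intercept(v):
    return np.column_stack([np.ones(len(v)), v])

def impute_passive(y, x, missing, rng):
    """Impute X from a linear regression of X on Y, then set X^2 = (X*)^2."""
    obs = ~missing
    coef = fit_ols(with_intercept(y[obs]), x[obs])
    resid_sd = np.std(x[obs] - with_intercept(y[obs]) @ coef, ddof=2)
    x_imp = x.copy()
    x_imp[missing] = (with_intercept(y[missing]) @ coef
                      + rng.normal(0.0, resid_sd, missing.sum()))
    return x_imp, x_imp ** 2

def impute_jav(y, x, missing, rng):
    """Treat X^2 as 'just another variable': draw (X, X^2) jointly given Y."""
    obs = ~missing
    targets = np.column_stack([x[obs], x[obs] ** 2])
    coef = fit_ols(with_intercept(y[obs]), targets)      # shape (2, 2)
    resid = targets - with_intercept(y[obs]) @ coef
    cov = resid.T @ resid / (obs.sum() - 2)               # residual covariance
    noise = rng.multivariate_normal(np.zeros(2), cov, size=missing.sum())
    draws = with_intercept(y[missing]) @ coef + noise
    x_imp, x2_imp = x.copy(), x ** 2
    x_imp[missing], x2_imp[missing] = draws[:, 0], draws[:, 1]
    return x_imp, x2_imp                                  # x2_imp need not equal x_imp**2

def run_demo(seed=0, n=20000, n_imputations=5):
    """Average the estimated coefficient of X^2 (true value 1) over the imputations."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(size=n)
    y = x_true + x_true ** 2 + rng.normal(size=n)
    missing = rng.random(n) < 0.5                         # missing completely at random
    x = np.where(missing, np.nan, x_true)
    result = {}
    for name, impute in [("passive", impute_passive), ("jav", impute_jav)]:
        estimates = []
        for _ in range(n_imputations):
            x_imp, x2_imp = impute(y, x, missing, rng)
            design = np.column_stack([np.ones(n), x_imp, x2_imp])
            estimates.append(fit_ols(design, y)[2])
        result[name] = float(np.mean(estimates))
    return result

if __name__ == "__main__":
    print(run_demo())
```

Under these assumptions the JAV estimate should sit close to the true quadratic coefficient, while the passive estimate is pulled towards zero, which matches the direction of the results reported in the abstract. A full multiple-imputation analysis would also draw the imputation-model parameters and combine variances with Rubin's rules.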