
A comparison of multiple imputation methods for missing data in longitudinal studies

BACKGROUND: Multiple imputation (MI) is now widely used to handle missing data in longitudinal studies. Several MI techniques have been proposed to impute incomplete longitudinal covariates, including standard fully conditional specification (FCS-Standard) and joint multivariate normal imputation (J...


Bibliographic Details
Main authors: Huque, Md Hamidul, Carlin, John B., Simpson, Julie A., Lee, Katherine J.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6292063/
https://www.ncbi.nlm.nih.gov/pubmed/30541455
http://dx.doi.org/10.1186/s12874-018-0615-6
author Huque, Md Hamidul
Carlin, John B.
Simpson, Julie A.
Lee, Katherine J.
collection PubMed
description BACKGROUND: Multiple imputation (MI) is now widely used to handle missing data in longitudinal studies. Several MI techniques have been proposed to impute incomplete longitudinal covariates, including standard fully conditional specification (FCS-Standard) and joint multivariate normal imputation (JM-MVN), which treat repeated measurements as distinct variables, and various extensions based on generalized linear mixed models. Although these MI approaches have been implemented in various software packages, there has not been a comprehensive evaluation of the relative performance of these methods in the context of longitudinal data.
METHOD: Using both empirical data and a simulation study based on data from the six waves of the Longitudinal Study of Australian Children (N = 4661), we investigated the performance of a wide range of MI methods available in standard software packages for investigating the association between child body mass index (BMI) and quality of life using both a linear regression and a linear mixed-effects model.
RESULTS: In this paper, we have identified and compared 12 different MI methods for imputing missing data in longitudinal studies. Analysis of simulated data under missing at random (MAR) mechanisms showed that the generally available MI methods provided less biased estimates with better coverage for the linear regression model and around half of these methods performed well for the estimation of regression parameters for a linear mixed model with random intercept. With the observed data, we observed an inverse association between child BMI and quality of life, with available data as well as multiple imputation.
CONCLUSION: Both FCS-Standard and JM-MVN performed well for the estimation of regression parameters in both analysis models. More complex methods that explicitly reflect the longitudinal structure for these analysis models may only be needed in specific circumstances such as irregularly spaced data.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12874-018-0615-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6292063
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6292063 2018-12-17 A comparison of multiple imputation methods for missing data in longitudinal studies Huque, Md Hamidul Carlin, John B. Simpson, Julie A. Lee, Katherine J. BMC Med Res Methodol Research Article BioMed Central 2018-12-12 /pmc/articles/PMC6292063/ /pubmed/30541455 http://dx.doi.org/10.1186/s12874-018-0615-6 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A comparison of multiple imputation methods for missing data in longitudinal studies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6292063/
https://www.ncbi.nlm.nih.gov/pubmed/30541455
http://dx.doi.org/10.1186/s12874-018-0615-6