
Missing Data in Clinical Research: A Tutorial on Multiple Imputation

Missing data is a common occurrence in clinical research. Missing data occurs when the value of the variables of interest are not measured or recorded for all subjects in the sample. Common approaches to addressing the presence of missing data include complete-case analyses, where subjects with miss...

Bibliographic Details
Main authors:	Austin, Peter C.; White, Ian R.; Lee, Douglas S.; van Buuren, Stef
Format:	Online Article Text
Language:	English
Published:	Pulsus Group 2021
Subjects:	Review
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8499698/ https://www.ncbi.nlm.nih.gov/pubmed/33276049 http://dx.doi.org/10.1016/j.cjca.2020.11.010

_version_	1784580356657119232
author	Austin, Peter C.; White, Ian R.; Lee, Douglas S.; van Buuren, Stef
author_facet	Austin, Peter C.; White, Ian R.; Lee, Douglas S.; van Buuren, Stef
author_sort	Austin, Peter C.
collection	PubMed
description	Missing data is a common occurrence in clinical research. Missing data occurs when the value of the variables of interest are not measured or recorded for all subjects in the sample. Common approaches to addressing the presence of missing data include complete-case analyses, where subjects with missing data are excluded, and mean-value imputation, where missing values are replaced with the mean value of that variable in those subjects for whom it is not missing. However, in many settings, these approaches can lead to biased estimates of statistics (eg, of regression coefficients) and/or confidence intervals that are artificially narrow. Multiple imputation (MI) is a popular approach for addressing the presence of missing data. With MI, multiple plausible values of a given variable are imputed or filled in for each subject who has missing data for that variable. This results in the creation of multiple completed data sets. Identical statistical analyses are conducted in each of these complete data sets and the results are pooled across complete data sets. We provide an introduction to MI and discuss issues in its implementation, including developing the imputation model, how many imputed data sets to create, and addressing derived variables. We illustrate the application of MI through an analysis of data on patients hospitalised with heart failure. We focus on developing a model to estimate the probability of 1-year mortality in the presence of missing data. Statistical software code for conducting MI in R, SAS, and Stata are provided.
format	Online Article Text
id	pubmed-8499698
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Pulsus Group
record_format	MEDLINE/PubMed
spelling	pubmed-8499698 2021-10-12 Missing Data in Clinical Research: A Tutorial on Multiple Imputation Austin, Peter C.; White, Ian R.; Lee, Douglas S.; van Buuren, Stef Can J Cardiol Review Pulsus Group 2021-09 /pmc/articles/PMC8499698/ /pubmed/33276049 http://dx.doi.org/10.1016/j.cjca.2020.11.010 Text en © 2020 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title	Missing Data in Clinical Research: A Tutorial on Multiple Imputation
topic	Review
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8499698/ https://www.ncbi.nlm.nih.gov/pubmed/33276049 http://dx.doi.org/10.1016/j.cjca.2020.11.010
