
The use of complete-case and multiple imputation-based analyses in molecular epidemiology studies that assess interaction effects

BACKGROUND: In molecular epidemiology studies biospecimen data are collected, often with the purpose of evaluating the synergistic role between a biomarker and another feature on an outcome. Typically, biomarker data are collected on only a proportion of subjects eligible for study, leading to a mis...


Bibliographic Details
Main Authors: Desai, Manisha, Esserman, Denise A, Gammon, Marilie D, Terry, Mary B
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3217865/
https://www.ncbi.nlm.nih.gov/pubmed/21978450
http://dx.doi.org/10.1186/1742-5573-8-5
author Desai, Manisha
Esserman, Denise A
Gammon, Marilie D
Terry, Mary B
author_sort Desai, Manisha
collection PubMed
description BACKGROUND: In molecular epidemiology studies biospecimen data are collected, often with the purpose of evaluating the synergistic role between a biomarker and another feature on an outcome. Typically, biomarker data are collected on only a proportion of subjects eligible for study, leading to a missing data problem. Missing data methods, however, are not customarily incorporated into analyses. Instead, complete-case (CC) analyses are performed, which can result in biased and inefficient estimates.
METHODS: Through simulations, we characterized the performance of CC methods when interaction effects are estimated. We also investigated whether standard multiple imputation (MI) could improve estimation over CC methods when the data are not missing at random (NMAR) and auxiliary information may or may not exist.
RESULTS: CC analyses were shown to result in considerable bias and efficiency loss. While MI reduced bias and increased efficiency over CC methods under specific conditions, it too resulted in biased estimates depending on the strength of the auxiliary data available and the nature of the missingness. In particular, CC performed better than MI when extreme values of the covariate were more likely to be missing, while MI outperformed CC when missingness of the covariate related to both the covariate and outcome. MI always improved performance when strong auxiliary data were available. In a real study, MI estimates of interaction effects were attenuated relative to those from a CC approach.
CONCLUSIONS: Our findings suggest the importance of incorporating missing data methods into the analysis. If the data are MAR, standard MI is a reasonable method. Auxiliary variables may make this assumption more reasonable even if the data are NMAR. Under NMAR we emphasize caution when using standard MI and recommend it over CC only when strong auxiliary data are available. MI, with the missing data mechanism specified, is an alternative when the data are NMAR. In all cases, it is recommended to take advantage of MI's ability to account for the uncertainty of these assumptions.
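The abstract's point that a complete-case analysis of an interaction effect can be fine under one missingness pattern and biased under another can be illustrated with a small simulation. The sketch below is not the authors' simulation design: it assumes a linear outcome model, made-up coefficients, and two deterministic missingness rules (biomarker missing when its value is extreme, and biomarker missing when the outcome is large). The names `simulate` and `fit_ols` are hypothetical helpers written for this illustration, and multiple imputation is not implemented here.

```python
import random

# Assumed (hypothetical) true coefficients: intercept, biomarker x,
# binary feature z, and the x*z interaction of interest.
TRUE_BETA = (0.0, 0.5, 0.5, 1.0)


def simulate(n, seed):
    """Draw n subjects as (x, z, y) from y = b0 + b1*x + b2*z + b3*x*z + noise."""
    rng = random.Random(seed)
    b0, b1, b2, b3 = TRUE_BETA
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                  # biomarker
        z = 1.0 if rng.random() < 0.5 else 0.0   # binary feature
        y = b0 + b1 * x + b2 * z + b3 * x * z + rng.gauss(0.0, 1.0)
        rows.append((x, z, y))
    return rows


def fit_ols(rows):
    """Least-squares fit of y ~ 1 + x + z + x*z; returns [b0, b1, b2, b3]."""
    p = 4
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for x, z, y in rows:
        d = (1.0, x, z, x * z)
        for i in range(p):
            xty[i] += d[i] * y
            for j in range(p):
                xtx[i][j] += d[i] * d[j]
    # Solve the normal equations by Gauss-Jordan elimination with pivoting.
    a = [xtx[i] + [xty[i]] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(p):
            if r != c:
                f = a[r][c] / a[c][c]
                for k in range(c, p + 1):
                    a[r][k] -= f * a[c][k]
    return [a[i][p] / a[i][i] for i in range(p)]


data = simulate(20000, seed=1)

# Interaction estimate with no missing biomarker values (benchmark).
full = fit_ols(data)[3]

# Complete cases when extreme biomarker values are missing (assumed rule).
cc_covariate = fit_ols([r for r in data if abs(r[0]) <= 1.5])[3]

# Complete cases when the biomarker is missing for large outcomes (assumed rule).
cc_outcome = fit_ols([r for r in data if r[2] <= 1.0])[3]

print("true interaction:         ", TRUE_BETA[3])
print("full data:                ", round(full, 3))
print("CC, extreme x missing:    ", round(cc_covariate, 3))
print("CC, missing by outcome:   ", round(cc_outcome, 3))
```

Under these assumptions the complete-case estimate stays close to the true interaction when missingness depends only on the biomarker, and is pulled toward zero when missingness depends on the outcome. The second rule depends only on the fully observed outcome, so it is a MAR mechanism rather than NMAR; it is used here only because it makes the complete-case bias easy to see.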
format Online
Article
Text
id pubmed-3217865
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3217865 2011-11-17 Epidemiol Perspect Innov Analytic Perspective BioMed Central 2011-10-06 /pmc/articles/PMC3217865/ /pubmed/21978450 http://dx.doi.org/10.1186/1742-5573-8-5 Text en Copyright ©2011 Desai et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The use of complete-case and multiple imputation-based analyses in molecular epidemiology studies that assess interaction effects
topic Analytic Perspective