Missing in space: an evaluation of imputation methods for missing data in spatial analysis of risk factors for type II diabetes

BACKGROUND: Spatial analysis is increasingly important for identifying modifiable geographic risk factors for disease. However, spatial health data from surveys are often incomplete, with missing data ranging from only a few variables to many variables. For spatial analyses of healt...

Bibliographic Details
Main Authors: Baker, Jannah, White, Nicole, Mengersen, Kerrie
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4287494/
https://www.ncbi.nlm.nih.gov/pubmed/25410053
http://dx.doi.org/10.1186/1476-072X-13-47
author Baker, Jannah
White, Nicole
Mengersen, Kerrie
author_sort Baker, Jannah
collection PubMed
description BACKGROUND: Spatial analysis is increasingly important for identifying modifiable geographic risk factors for disease. However, spatial health data from surveys are often incomplete, with missing data ranging from only a few variables to many variables. For spatial analyses of health outcomes, selecting an appropriate imputation method is critical to producing the most accurate inferences. METHODS: We present a cross-validation approach for selecting among three imputation methods for health survey data with correlated lifestyle covariates, using type II diabetes mellitus (DM II) risk across 71 Queensland Local Government Areas (LGAs) as a case study. We compare the accuracy of mean imputation with that of imputation using multivariate normal and conditional autoregressive prior distributions. RESULTS: The best choice of imputation method depends on the application and is not necessarily the most complex method. Mean imputation was selected as the most accurate method in this application. CONCLUSIONS: Selecting an appropriate imputation method for health survey data, after accounting for spatial correlation and correlation between covariates, allows a more complete analysis of geographic risk factors for disease and gives more confidence in the results used to inform public policy decision-making. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1476-072X-13-47) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4287494
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4287494 2015-01-09 Int J Health Geogr Research BioMed Central 2014-11-20 /pmc/articles/PMC4287494/ /pubmed/25410053 http://dx.doi.org/10.1186/1476-072X-13-47 Text en © Baker et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Missing in space: an evaluation of imputation methods for missing data in spatial analysis of risk factors for type II diabetes
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4287494/
https://www.ncbi.nlm.nih.gov/pubmed/25410053
http://dx.doi.org/10.1186/1476-072X-13-47
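
The cross-validation selection described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the article compares mean imputation with Bayesian imputation under multivariate normal and conditional autoregressive priors, whereas the sketch below swaps in a simple neighbour-mean imputer as a crude spatial stand-in. The function names, the chain-shaped adjacency among 71 areas, and the simulated covariate are all hypothetical and chosen only for illustration. The idea shown is the general one: hide values that were actually observed, impute them with each candidate method, and keep the method with the lowest held-out error.

```python
import numpy as np

def mean_impute(values, missing, neighbors):
    """Fill missing entries with the mean of all observed entries."""
    filled = values.copy()
    filled[missing] = values[~missing].mean()
    return filled

def neighbor_mean_impute(values, missing, neighbors):
    """Fill each missing entry with the mean of its observed neighbours.

    Hypothetical stand-in for a spatial method (NOT a CAR prior); falls
    back to the global observed mean when no neighbour is observed.
    """
    filled = values.copy()
    global_mean = values[~missing].mean()
    for i in np.flatnonzero(missing):
        obs = [j for j in neighbors[i] if not missing[j]]
        filled[i] = values[obs].mean() if obs else global_mean
    return filled

def cv_rmse(values, observed, neighbors, impute, n_folds=5, seed=0):
    """Hold out folds of observed values, impute them, return the RMSE."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(np.flatnonzero(observed)), n_folds)
    sq_err = []
    for fold in folds:
        missing = ~observed          # entries that are truly missing
        missing[fold] = True         # plus the entries hidden in this fold
        filled = impute(values, missing, neighbors)
        sq_err.extend((filled[fold] - values[fold]) ** 2)
    return float(np.sqrt(np.mean(sq_err)))

def select_imputation(values, neighbors, methods, n_folds=5, seed=0):
    """Score every candidate on the same folds and return the best one."""
    observed = ~np.isnan(values)
    scores = {name: cv_rmse(values, observed, neighbors, fn, n_folds, seed)
              for name, fn in methods.items()}
    return min(scores, key=scores.get), scores

def chain_neighbors(n):
    """Hypothetical adjacency: area i borders areas i-1 and i+1."""
    return [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]

METHODS = {"mean": mean_impute, "neighbor_mean": neighbor_mean_impute}

# Toy data: 71 areas, a simulated covariate, and a few missing values (NaN).
rng = np.random.default_rng(1)
toy_values = np.linspace(0.0, 1.0, 71) + rng.normal(scale=0.05, size=71)
toy_values[[4, 18, 33, 60]] = np.nan
best, scores = select_imputation(toy_values, chain_neighbors(71), METHODS)
print(best, scores)
```

Which method wins depends entirely on the data: a strongly spatially structured covariate favours the neighbour-based imputer, while a noisy one with little spatial pattern can favour the plain mean, which is consistent with the abstract's point that the most complex method is not necessarily the most accurate.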