Anonymisation of address coordinates for microlevel analyses of the built environment: a simulation study

BACKGROUND: Data privacy is a major concern in spatial epidemiology because exact residential locations or parts of participants’ addresses such as street or zip codes are used to perform geospatial analyses. To overcome this concern, different levels of aggregation such as census districts or zip c...

Bibliographic Details
Main authors: Buck, Christoph; Dreger, Steffen; Pigeot, Iris
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2015
Subjects: Epidemiology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4360832/
https://www.ncbi.nlm.nih.gov/pubmed/25753360
http://dx.doi.org/10.1136/bmjopen-2014-006481
author Buck, Christoph
Dreger, Steffen
Pigeot, Iris
collection PubMed
description BACKGROUND: Data privacy is a major concern in spatial epidemiology because exact residential locations or parts of participants’ addresses, such as street or zip codes, are used to perform geospatial analyses. To overcome this concern, different levels of aggregation such as census districts or zip code areas are mainly used, though any spatial aggregation leads to a loss of spatial variability. For the assessment of urban opportunities for physical activity that was conducted in the IDEFICS (Identification and prevention of dietary- and lifestyle-induced health effects in children and infants) study, macrolevel analyses were performed, but the use of exact residential addresses for microlevel analyses was not permitted by the responsible office for data protection. We therefore implemented spatial blurring to anonymise address coordinates depending on the underlying population density.
METHODS: We added a standard Gaussian distributed error to individual address coordinates, with the variance depending on the population density and on the chosen k-anonymity. 1000 random point locations were generated and each was blurred 100 times to obtain anonymised locations. For each location, 1 km network-dependent neighbourhoods were used to calculate walkability indices. Indices of blurred locations were compared with indices based on their sampling origins to determine the effect of spatial blurring on the assessment of the built environment.
RESULTS: The extent of spatial blurring decreased with increasing population density. Similarly, mean differences in walkability indices decreased with increasing population density. In particular, for densely populated areas with at least 1500 residents per km², differences between blurred locations and their sampling origins were small and did not affect the assessment of the built environment after spatial blurring.
CONCLUSIONS: This approach allowed the investigation of the built environment at a microlevel using individual network-dependent neighbourhoods, while meeting data protection requirements. Spatial blurring had only a minor influence on the assessment of walkability, slightly affecting the assessment of the built environment in sparsely populated areas.
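The blurring step described under METHODS can be illustrated with a minimal sketch. The record does not give the variance formula, so the rule used here is an assumption made purely for illustration: the standard deviation is set to the radius of a circle expected to contain k residents at the given density, i.e. sigma = sqrt(k / (pi * density)). The function names, the 10 km sampling square and the use of projected coordinates in metres are likewise hypothetical, and no walkability index is computed.

```python
import math
import random


def blur_sigma_m(density_per_km2: float, k: int) -> float:
    """Standard deviation of the blur in metres.

    ASSUMPTION (not taken from the record): sigma equals the radius of a
    circle that is expected to contain k residents at the given density.
    """
    if density_per_km2 <= 0 or k < 1:
        raise ValueError("density must be positive and k must be at least 1")
    radius_km = math.sqrt(k / (math.pi * density_per_km2))
    return radius_km * 1000.0


def blur_point(x_m, y_m, density_per_km2, k, rng):
    """Add independent zero-mean Gaussian noise to each projected coordinate."""
    sigma = blur_sigma_m(density_per_km2, k)
    return x_m + rng.gauss(0.0, sigma), y_m + rng.gauss(0.0, sigma)


def mean_displacement_m(density_per_km2, k, n_points=1000, n_repeats=100, seed=0):
    """Average distance between random origins and their blurred copies.

    Mirrors the simulation layout in METHODS (1000 points, 100 blurs each);
    the 10 km x 10 km sampling square is a hypothetical study area.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_points):
        x = rng.uniform(0.0, 10_000.0)
        y = rng.uniform(0.0, 10_000.0)
        for _ in range(n_repeats):
            bx, by = blur_point(x, y, density_per_km2, k, rng)
            total += math.hypot(bx - x, by - y)
    return total / (n_points * n_repeats)


if __name__ == "__main__":
    # Compare a sparsely populated setting with the 1500 residents per km²
    # threshold mentioned in RESULTS; k = 5 is an arbitrary example value.
    for density in (150, 1500):
        print(density, round(mean_displacement_m(density, k=5), 1))
```

Under this assumed rule the displacement shrinks as density grows, which matches the direction reported in RESULTS but says nothing about the magnitudes found in the study.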
format Online
Article
Text
id pubmed-4360832
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4360832 2015-03-25 Anonymisation of address coordinates for microlevel analyses of the built environment: a simulation study Buck, Christoph Dreger, Steffen Pigeot, Iris BMJ Open Epidemiology BMJ Publishing Group 2015-03-07 /pmc/articles/PMC4360832/ /pubmed/25753360 http://dx.doi.org/10.1136/bmjopen-2014-006481 Text en Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
title Anonymisation of address coordinates for microlevel analyses of the built environment: a simulation study
topic Epidemiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4360832/
https://www.ncbi.nlm.nih.gov/pubmed/25753360
http://dx.doi.org/10.1136/bmjopen-2014-006481