Anonymisation of address coordinates for microlevel analyses of the built environment: a simulation study
BACKGROUND: Data privacy is a major concern in spatial epidemiology because exact residential locations or parts of participants’ addresses such as street or zip codes are used to perform geospatial analyses. To overcome this concern, different levels of aggregation such as census districts or zip c...
Main authors: | Buck, Christoph, Dreger, Steffen, Pigeot, Iris |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group 2015 |
Subjects: | Epidemiology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4360832/ https://www.ncbi.nlm.nih.gov/pubmed/25753360 http://dx.doi.org/10.1136/bmjopen-2014-006481 |
_version_ | 1782361592809127936 |
---|---|
author | Buck, Christoph Dreger, Steffen Pigeot, Iris |
author_sort | Buck, Christoph |
collection | PubMed |
description | BACKGROUND: Data privacy is a major concern in spatial epidemiology because exact residential locations or parts of participants’ addresses such as street or zip codes are used to perform geospatial analyses. To overcome this concern, different levels of aggregation such as census districts or zip code areas are mainly used, though any spatial aggregation leads to a loss of spatial variability. For the assessment of urban opportunities for physical activity that was conducted in the IDEFICS (Identification and prevention of dietary- and lifestyle-induced health effects in children and infants) study, macrolevel analyses were performed, but the use of exact residential addresses for micro-level analyses was not permitted by the responsible office for data protection. We therefore implemented a spatial blurring to anonymise address coordinates depending on the underlying population density. METHODS: We added a standard Gaussian distributed error to individual address coordinates with the variance [Image: see text] depending on the population density and on the chosen k-anonymity. 1000 random point locations were generated and repeatedly blurred 100 times to obtain anonymised locations. For each location 1 km network-dependent neighbourhoods were used to calculate walkability indices. Indices of blurred locations were compared to indices based on their sampling origins to determine the effect of spatial blurring on the assessment of the built environment. RESULTS: Spatial blurring decreased with increasing population density. Similarly, mean differences in walkability indices also decreased with increasing population density. In particular for densely-populated areas with at least 1500 residents per km², differences between blurred locations and their sampling origins were small and did not affect the assessment of the built environment after spatial blurring. CONCLUSIONS: This approach allowed the investigation of the built environment at a microlevel using individual network-dependent neighbourhoods, while ensuring data protection requirements. Minor influence of spatial blurring on the assessment of walkability was found that slightly affected the assessment of the built environment in sparsely-populated areas. |
format | Online Article Text |
id | pubmed-4360832 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-4360832 2015-03-25 Anonymisation of address coordinates for microlevel analyses of the built environment: a simulation study Buck, Christoph Dreger, Steffen Pigeot, Iris BMJ Open Epidemiology BMJ Publishing Group 2015-03-07 /pmc/articles/PMC4360832/ /pubmed/25753360 http://dx.doi.org/10.1136/bmjopen-2014-006481 Text en Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/ |
title | Anonymisation of address coordinates for microlevel analyses of the built environment: a simulation study |
topic | Epidemiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4360832/ https://www.ncbi.nlm.nih.gov/pubmed/25753360 http://dx.doi.org/10.1136/bmjopen-2014-006481 |
work_keys_str_mv | AT buckchristoph anonymisationofaddresscoordinatesformicrolevelanalysesofthebuiltenvironmentasimulationstudy AT dregersteffen anonymisationofaddresscoordinatesformicrolevelanalysesofthebuiltenvironmentasimulationstudy AT pigeotiris anonymisationofaddresscoordinatesformicrolevelanalysesofthebuiltenvironmentasimulationstudy |
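
The blurring step described in the METHODS part of the abstract can be illustrated with a minimal sketch. The record does not reproduce the variance formula (it appears only as an image placeholder), so the rule used here is an assumption made for illustration: the standard deviation is set to the radius of a circle expected to contain k residents at the given population density. The function names `blur_sigma_km`, `blur_point` and `mean_displacement_km`, the 10 km square sampling area and the default seed are likewise hypothetical and not taken from the study.

```python
import math
import random


def blur_sigma_km(k, density_per_km2):
    """Standard deviation (km) of the blurring error.

    ASSUMPTION, not the authors' formula: sigma is the radius of a
    circle that is expected to hold k residents at this density.
    """
    if k <= 0 or density_per_km2 <= 0:
        raise ValueError("k and density must be positive")
    return math.sqrt(k / (math.pi * density_per_km2))


def blur_point(x_km, y_km, k, density_per_km2, rng):
    """Add independent Gaussian errors to both coordinates."""
    sigma = blur_sigma_km(k, density_per_km2)
    return (x_km + rng.gauss(0.0, sigma), y_km + rng.gauss(0.0, sigma))


def mean_displacement_km(k, density_per_km2, n_points=1000,
                         n_repeats=100, seed=0):
    """Average distance between blurred points and their origins.

    Mirrors the simulation layout in the abstract (1000 random
    locations, each blurred 100 times); the 10 km square sampling
    area is a hypothetical choice.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_points):
        x, y = rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)
        for _ in range(n_repeats):
            bx, by = blur_point(x, y, k, density_per_km2, rng)
            total += math.hypot(bx - x, by - y)
    return total / (n_points * n_repeats)


if __name__ == "__main__":
    for density in (150, 1500, 15000):
        d = mean_displacement_km(5, density, n_points=200, n_repeats=20)
        print(f"{density:>6} residents/km^2: mean shift {d:.3f} km")
```

Under this assumed rule the displacement shrinks as density grows, which is in line with the reported result that spatial blurring decreased with increasing population density. The walkability indices and network-dependent neighbourhoods from the study are not modelled here.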