
Accuracy of commercial geocoding: assessment and implications

BACKGROUND: Published studies of geocoding accuracy often focus on a single geographic area, address source or vendor, do not adjust accuracy measures for address characteristics, and do not examine effects of inaccuracy on exposure measures. We addressed these issues in a Women's Health Initia...

Bibliographic Details
Main authors: Whitsel, Eric A; Quibrera, P Miguel; Smith, Richard L; Catellier, Diane J; Liao, Duanping; Henley, Amanda C; Heiss, Gerardo
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1557664/
https://www.ncbi.nlm.nih.gov/pubmed/16857050
http://dx.doi.org/10.1186/1742-5573-3-8
author Whitsel, Eric A
Quibrera, P Miguel
Smith, Richard L
Catellier, Diane J
Liao, Duanping
Henley, Amanda C
Heiss, Gerardo
collection PubMed
description BACKGROUND: Published studies of geocoding accuracy often focus on a single geographic area, address source or vendor, do not adjust accuracy measures for address characteristics, and do not examine effects of inaccuracy on exposure measures. We addressed these issues in a Women's Health Initiative ancillary study, the Environmental Epidemiology of Arrhythmogenesis in WHI. RESULTS: Addresses in 49 U.S. states (n = 3,615) with established coordinates were geocoded by four vendors (A-D). There were important differences among vendors in address match rate (98%; 82%; 81%; 30%), concordance between established and vendor-assigned census tracts (85%; 88%; 87%; 98%) and distance between established and vendor-assigned coordinates (mean ρ [meters]: 1809; 748; 704; 228). Mean ρ was lowest among street-matched, complete, zip-coded, unedited and urban addresses, and addresses with North American Datum of 1983 or World Geodetic System of 1984 coordinates. In mixed models restricted to vendors with minimally acceptable match rates (A-C) and adjusted for address characteristics, within-address correlation, and among-vendor heteroscedasticity of ρ, differences in mean ρ were small for street-type matches (280; 268; 275), i.e. likely to bias results relying on them about equally for most applications. In contrast, differences between centroid-type matches were substantial in some vendor contrasts, but not others (5497; 4303; 4210; p for interaction < 10^-4), i.e. more likely to bias results differently in many applications. The adjusted odds of an address match were higher for vendor A versus C (odds ratio = 66, 95% confidence interval: 47, 93), but not B versus C (OR = 1.1, 95% CI: 0.9, 1.3). The adjusted odds of census tract concordance were no higher for vendor A versus C (OR = 1.0, 95% CI: 0.9, 1.2) or B versus C (OR = 1.1, 95% CI: 0.9, 1.3). Misclassification of a related exposure measure (distance to the nearest highway) increased with mean ρ and, in the absence of confounding, non-differential misclassification of this distance biased its hypothetical association with coronary heart disease mortality toward the null. CONCLUSION: Geocoding error depends on measures used to evaluate it, address characteristics and vendor. Vendor selection presents a trade-off between potential for missing data and error in estimating spatially defined attributes. Informed selection is needed to control the trade-off and adjust analyses for its effects.
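The abstract reports two per-vendor summaries: the address match rate and the mean distance ρ (in meters) between established and vendor-assigned coordinates. The record does not say how the authors computed ρ, so the following is only a minimal sketch that assumes a great-circle (haversine) distance on a spherical Earth and ignores datum differences. The function names `positional_error_m` and `summarize_vendor`, and the dictionary layout of the inputs, are hypothetical and chosen for illustration.

```python
import math

# Mean Earth radius in meters; using a sphere is an assumption of this sketch.
EARTH_RADIUS_M = 6_371_008.8


def positional_error_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in meters between two points
    given as latitude/longitude in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def summarize_vendor(established, vendor_results):
    """Return (match_rate, mean_rho_m) for one vendor.

    established:    dict address_id -> (lat, lon) reference coordinates
    vendor_results: dict address_id -> (lat, lon), or None if unmatched
    """
    errors = [
        positional_error_m(*established[key], *coords)
        for key, coords in vendor_results.items()
        if coords is not None and key in established
    ]
    # Match rate is taken over all reference addresses, matched or not.
    match_rate = len(errors) / len(established)
    # Mean rho is taken over matched addresses only.
    mean_rho = sum(errors) / len(errors) if errors else float("nan")
    return match_rate, mean_rho
```

Calling `summarize_vendor` once per vendor on the same reference set gives figures comparable in kind to the unadjusted match rates and mean ρ values quoted above. The adjusted comparisons in the abstract come from mixed models, which this sketch does not attempt.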
format Text
id pubmed-1557664
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1557664 2006-08-31 Accuracy of commercial geocoding: assessment and implications Whitsel, Eric A Quibrera, P Miguel Smith, Richard L Catellier, Diane J Liao, Duanping Henley, Amanda C Heiss, Gerardo Epidemiol Perspect Innov Research BioMed Central 2006-07-20 /pmc/articles/PMC1557664/ /pubmed/16857050 http://dx.doi.org/10.1186/1742-5573-3-8 Text en Copyright © 2006 Whitsel et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Accuracy of commercial geocoding: assessment and implications
topic Research