
The use of differential privacy for census data and its impact on redistricting: The case of the 2020 U.S. Census

Census statistics play a key role in public policy decisions and social science research. However, given the risk of revealing individual information, many statistical agencies are considering disclosure control methods based on differential privacy, which add noise to tabulated data. Unlike other a...


Bibliographic Details
Main Authors: Kenny, Christopher T., Kuriwaki, Shiro, McCartan, Cory, Rosenman, Evan T. R., Simko, Tyler, Imai, Kosuke
Format: Online Article Text
Language: English
Published: American Association for the Advancement of Science 2021
Subjects: Social and Interdisciplinary Sciences
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8494446/
https://www.ncbi.nlm.nih.gov/pubmed/34613778
http://dx.doi.org/10.1126/sciadv.abk3283
author Kenny, Christopher T.
Kuriwaki, Shiro
McCartan, Cory
Rosenman, Evan T. R.
Simko, Tyler
Imai, Kosuke
collection PubMed
description Census statistics play a key role in public policy decisions and social science research. However, given the risk of revealing individual information, many statistical agencies are considering disclosure control methods based on differential privacy, which add noise to tabulated data. Unlike other applications of differential privacy, however, census statistics must be postprocessed after noise injection to be usable. We study the impact of the U.S. Census Bureau’s latest disclosure avoidance system (DAS) on a major application of census statistics, the redrawing of electoral districts. We find that the DAS systematically undercounts the population in mixed-race and mixed-partisan precincts, yielding unpredictable racial and partisan biases. While the DAS leads to a likely violation of the “One Person, One Vote” standard as currently interpreted, it does not prevent accurate predictions of an individual’s race and ethnicity. Our findings underscore the difficulty of balancing accuracy and respondent privacy in the Census.
format Online
Article
Text
id pubmed-8494446
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Association for the Advancement of Science
record_format MEDLINE/PubMed
spelling pubmed-8494446 2021-10-13 Sci Adv Social and Interdisciplinary Sciences American Association for the Advancement of Science 2021-10-06 /pmc/articles/PMC8494446/ /pubmed/34613778 http://dx.doi.org/10.1126/sciadv.abk3283 Text en
Copyright © 2021 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.
title The use of differential privacy for census data and its impact on redistricting: The case of the 2020 U.S. Census
topic Social and Interdisciplinary Sciences