
Balancing data privacy and usability in the federal statistical system

The federal statistical system is experiencing competing pressures for change. On the one hand, for confidentiality reasons, much socially valuable data currently held by federal agencies is either not made available to researchers at all or only made available under onerous conditions. On the other...

Bibliographic Details
Main authors: Hotz, V. Joseph, Bollinger, Christopher R., Komarova, Tatiana, Manski, Charles F., Moffitt, Robert A., Nekipelov, Denis, Sojourner, Aaron, Spencer, Bruce D.
Format: Online Article Text
Language: English
Published: National Academy of Sciences 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9351352/
https://www.ncbi.nlm.nih.gov/pubmed/35878030
http://dx.doi.org/10.1073/pnas.2104906119
author Hotz, V. Joseph
Bollinger, Christopher R.
Komarova, Tatiana
Manski, Charles F.
Moffitt, Robert A.
Nekipelov, Denis
Sojourner, Aaron
Spencer, Bruce D.
collection PubMed
description The federal statistical system is experiencing competing pressures for change. On the one hand, for confidentiality reasons, much socially valuable data currently held by federal agencies is either not made available to researchers at all or only made available under onerous conditions. On the other hand, agencies which release public databases face new challenges in protecting the privacy of the subjects in those databases, which leads them to consider releasing fewer data or masking the data in ways that will reduce their accuracy. In this essay, we argue that the discussion has not given proper consideration to the reduced social benefits of data availability and their usability relative to the value of increased levels of privacy protection. A more balanced benefit–cost framework should be used to assess these trade-offs. We express concerns both with synthetic data methods for disclosure limitation, which will reduce the types of research that can be reliably conducted in unknown ways, and with differential privacy criteria that use what we argue is an inappropriate measure of disclosure risk. We recommend that the measure of disclosure risk used to assess all disclosure protection methods focus on what we believe is the risk that individuals should care about, that more study of the impact of differential privacy criteria and synthetic data methods on data usability for research be conducted before either is put into widespread use, and that more research be conducted on alternative methods of disclosure risk reduction that better balance benefits and costs.
format Online
Article
Text
id pubmed-9351352
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher National Academy of Sciences
record_format MEDLINE/PubMed
spelling pubmed-9351352 2023-01-25 Balancing data privacy and usability in the federal statistical system Hotz, V. Joseph Bollinger, Christopher R. Komarova, Tatiana Manski, Charles F. Moffitt, Robert A. Nekipelov, Denis Sojourner, Aaron Spencer, Bruce D. Proc Natl Acad Sci U S A Perspective
National Academy of Sciences 2022-07-25 2022-08-02 /pmc/articles/PMC9351352/ /pubmed/35878030 http://dx.doi.org/10.1073/pnas.2104906119 Text en Copyright © 2022 the Author(s). Published by PNAS. https://creativecommons.org/licenses/by-nc-nd/4.0/ This article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Balancing data privacy and usability in the federal statistical system
topic Perspective