A Game Theoretic Framework for Analyzing Re-Identification Risk

Given the potential wealth of insights in personal data the big databases can provide, many organizations aim to share data while protecting privacy by sharing de-identified data, but are concerned because various demonstrations show such data can be re-identified. Yet these investigations focus on...

Bibliographic Details
Main Authors: Wan, Zhiyu; Vorobeychik, Yevgeniy; Xia, Weiyi; Clayton, Ellen Wright; Kantarcioglu, Murat; Ganta, Ranjit; Heatherly, Raymond; Malin, Bradley A.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4373733/
https://www.ncbi.nlm.nih.gov/pubmed/25807380
http://dx.doi.org/10.1371/journal.pone.0120592
author Wan, Zhiyu
Vorobeychik, Yevgeniy
Xia, Weiyi
Clayton, Ellen Wright
Kantarcioglu, Murat
Ganta, Ranjit
Heatherly, Raymond
Malin, Bradley A.
collection PubMed
description Given the potential wealth of insights in personal data the big databases can provide, many organizations aim to share data while protecting privacy by sharing de-identified data, but are concerned because various demonstrations show such data can be re-identified. Yet these investigations focus on how attacks can be perpetrated, not the likelihood they will be realized. This paper introduces a game theoretic framework that enables a publisher to balance re-identification risk with the value of sharing data, leveraging a natural assumption that a recipient only attempts re-identification if its potential gains outweigh the costs. We apply the framework to a real case study, where the value of the data to the publisher is the actual grant funding dollar amounts from a national sponsor and the re-identification gain of the recipient is the fine paid to a regulator for violation of federal privacy rules. There are three notable findings: 1) it is possible to achieve zero risk, in that the recipient never gains from re-identification, while sharing almost as much data as the optimal solution that allows for a small amount of risk; 2) the zero-risk solution enables sharing much more data than a commonly invoked de-identification policy of the U.S. Health Insurance Portability and Accountability Act (HIPAA); and 3) a sensitivity analysis demonstrates these findings are robust to order-of-magnitude changes in player losses and gains. In combination, these findings provide support that such a framework can enable pragmatic policy decisions about de-identified data sharing.
format Online
Article
Text
id pubmed-4373733
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4373733 2015-03-27 A Game Theoretic Framework for Analyzing Re-Identification Risk Wan, Zhiyu Vorobeychik, Yevgeniy Xia, Weiyi Clayton, Ellen Wright Kantarcioglu, Murat Ganta, Ranjit Heatherly, Raymond Malin, Bradley A. PLoS One Research Article
Public Library of Science 2015-03-25 /pmc/articles/PMC4373733/ /pubmed/25807380 http://dx.doi.org/10.1371/journal.pone.0120592 Text en © 2015 Wan et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Game Theoretic Framework for Analyzing Re-Identification Risk
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4373733/
https://www.ncbi.nlm.nih.gov/pubmed/25807380
http://dx.doi.org/10.1371/journal.pone.0120592