Evaluation of Privacy Risks of Patients’ Data in China: Case Study

BACKGROUND: Patient privacy is a ubiquitous problem around the world. Many existing studies have demonstrated the potential privacy risks associated with sharing of biomedical data. Owing to the increasing need for data sharing and analysis, health care data privacy is drawing more attention. Howeve...

Bibliographic Details
Main Authors: Gong, Mengchun, Wang, Shuang, Wang, Lezi, Liu, Chao, Wang, Jianyang, Guo, Qiang, Zheng, Hao, Xie, Kang, Wang, Chenghong, Hui, Zhouguang
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7055805/
https://www.ncbi.nlm.nih.gov/pubmed/32022691
http://dx.doi.org/10.2196/13046
_version_ 1783503423793004544
author Gong, Mengchun
Wang, Shuang
Wang, Lezi
Liu, Chao
Wang, Jianyang
Guo, Qiang
Zheng, Hao
Xie, Kang
Wang, Chenghong
Hui, Zhouguang
author_sort Gong, Mengchun
collection PubMed
description BACKGROUND: Patient privacy is a ubiquitous problem around the world. Many existing studies have demonstrated the potential privacy risks associated with sharing biomedical data. Owing to the increasing need for data sharing and analysis, health care data privacy is drawing more attention. However, to better protect biomedical data privacy, it is essential to first assess the privacy risk.
OBJECTIVE: In China, there is no clear regulation for health systems to deidentify data. It is also not known whether a mechanism such as the Health Insurance Portability and Accountability Act (HIPAA) safe harbor policy will achieve sufficient protection. This pilot study used patient data from Chinese hospitals to understand and quantify the privacy risks of Chinese patients.
METHODS: We used g-distinct analysis to evaluate the reidentification risks of the HIPAA safe harbor approach when applied to Chinese patients’ data. More specifically, we estimated the risks under the HIPAA safe harbor and limited dataset policies by assuming that an attacker has background knowledge of the patient from the public domain.
RESULTS: The experiments were conducted on 0.83 million patients (with data fields of date of birth, gender, and surrogate ZIP codes generated from home addresses) across 33 provincial-level administrative divisions in China. Under the limited dataset policy, 19.58% (163,262/833,235) of the population is uniquely identifiable under the g-distinct metric (ie, 1-distinct). In contrast, the safe harbor policy significantly reduces privacy risk: only 0.072% (601/833,235) of individuals are uniquely identifiable, and the majority of the population is 3000-indistinguishable (ie, they are expected to share common attributes with 3000 or fewer people).
CONCLUSIONS: Experiments on real-world patient data show that the results of g-distinct analysis of Chinese patient privacy risk are similar to those of a previous US study, in which data from different organizations/regions might be vulnerable to different reidentification risks under different policies. This work provides a reference for Chinese health care entities for estimating patients’ privacy risk during data sharing and lays a foundation for future privacy risk studies of Chinese patients’ data.
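The g-distinct metric described in the abstract can be illustrated with a minimal sketch. It assumes that a person counts as g-distinct when at most g people in the population (including that person) share the same combination of date of birth, gender, and ZIP code. The toy records, the function names, and the simplified policy views (full values for the limited dataset; birth year and first three ZIP digits for safe harbor) are illustrative assumptions and are not taken from the study's data or code.

```python
from collections import Counter

# Hypothetical toy records: (date of birth, gender, surrogate ZIP code).
RECORDS = [
    ("1980-01-15", "F", "10001"),
    ("1980-06-30", "F", "10002"),
    ("1980-11-02", "F", "10003"),
    ("1975-03-09", "M", "20001"),
    ("1975-03-09", "M", "20001"),
    ("1990-07-21", "M", "30005"),
]


def limited_dataset_view(record):
    """Assumed limited dataset view: full date of birth and full ZIP code are kept."""
    return record


def safe_harbor_view(record):
    """Assumed, simplified safe harbor view: birth year and first 3 ZIP digits only."""
    dob, gender, zip_code = record
    return (dob[:4], gender, zip_code[:3])


def g_distinct_fraction(records, view, g):
    """Fraction of people whose attribute combination is shared by at most g people."""
    keys = [view(r) for r in records]
    group_sizes = Counter(keys)
    at_risk = sum(1 for key in keys if group_sizes[key] <= g)
    return at_risk / len(keys)


if __name__ == "__main__":
    for name, view in [("limited dataset", limited_dataset_view),
                       ("safe harbor", safe_harbor_view)]:
        for g in (1, 2, 3):
            share = g_distinct_fraction(RECORDS, view, g)
            print(f"{name}: {g}-distinct share = {share:.3f}")
```

With g = 1 the function returns the share of uniquely identifiable people, which is the quantity the abstract reports as 19.58% under the limited dataset policy and 0.072% under the safe harbor policy. Coarsening the attributes merges groups, which is why the safe harbor view yields a smaller share in this toy example as well.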
format Online
Article
Text
id pubmed-7055805
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7055805 2020-03-16 JMIR Med Inform Original Paper JMIR Publications 2020-02-05 /pmc/articles/PMC7055805/ /pubmed/32022691 http://dx.doi.org/10.2196/13046 Text en ©Mengchun Gong, Shuang Wang, Lezi Wang, Chao Liu, Jianyang Wang, Qiang Guo, Hao Zheng, Kang Xie, Chenghong Wang, Zhouguang Hui. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 05.02.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Evaluation of Privacy Risks of Patients’ Data in China: Case Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7055805/
https://www.ncbi.nlm.nih.gov/pubmed/32022691
http://dx.doi.org/10.2196/13046