Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery
Objective To propose a new approach to privacy preserving data selection, which helps the data users access human genomic datasets efficiently without undermining patients’ privacy. Methods Our idea is to let each data owner publish a set of differentially-private pilot data, on which a data user ca...
Main Authors: | Zhao, Yongan; Wang, Xiaofeng; Jiang, Xiaoqian; Ohno-Machado, Lucila; Tang, Haixu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2015 |
Subjects: | Research and Applications |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4433380/ https://www.ncbi.nlm.nih.gov/pubmed/25352565 http://dx.doi.org/10.1136/amiajnl-2014-003043 |
_version_ | 1782371636596441088 |
---|---|
author | Zhao, Yongan Wang, Xiaofeng Jiang, Xiaoqian Ohno-Machado, Lucila Tang, Haixu |
author_facet | Zhao, Yongan Wang, Xiaofeng Jiang, Xiaoqian Ohno-Machado, Lucila Tang, Haixu |
author_sort | Zhao, Yongan |
collection | PubMed |
description | Objective To propose a new approach to privacy preserving data selection, which helps the data users access human genomic datasets efficiently without undermining patients’ privacy. Methods Our idea is to let each data owner publish a set of differentially-private pilot data, on which a data user can test-run arbitrary association-test algorithms, including those not known to the data owner a priori. We developed a suite of new techniques, including a pilot-data generation approach that leverages the linkage disequilibrium in the human genome to preserve both the utility of the data and the privacy of the patients, and a utility evaluation method that helps the user assess the value of the real data from its pilot version with high confidence. Results We evaluated our approach on real human genomic data using four popular association tests. Our study shows that the proposed approach can help data users make the right choices in most cases. Conclusions Even though the pilot data cannot be directly used for scientific discovery, it provides a useful indication of which datasets are more likely to be useful to data users, who can therefore approach the appropriate data owners to gain access to the data. |
format | Online Article Text |
id | pubmed-4433380 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4433380 2016-01-01 Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery Zhao, Yongan Wang, Xiaofeng Jiang, Xiaoqian Ohno-Machado, Lucila Tang, Haixu J Am Med Inform Assoc Research and Applications Objective To propose a new approach to privacy preserving data selection, which helps the data users access human genomic datasets efficiently without undermining patients’ privacy. Methods Our idea is to let each data owner publish a set of differentially-private pilot data, on which a data user can test-run arbitrary association-test algorithms, including those not known to the data owner a priori. We developed a suite of new techniques, including a pilot-data generation approach that leverages the linkage disequilibrium in the human genome to preserve both the utility of the data and the privacy of the patients, and a utility evaluation method that helps the user assess the value of the real data from its pilot version with high confidence. Results We evaluated our approach on real human genomic data using four popular association tests. Our study shows that the proposed approach can help data users make the right choices in most cases. Conclusions Even though the pilot data cannot be directly used for scientific discovery, it provides a useful indication of which datasets are more likely to be useful to data users, who can therefore approach the appropriate data owners to gain access to the data. Oxford University Press 2015-01 2014-10-28 /pmc/articles/PMC4433380/ /pubmed/25352565 http://dx.doi.org/10.1136/amiajnl-2014-003043 Text en © The Author 2014. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com. For numbered affiliations see end of article. |
spellingShingle | Research and Applications Zhao, Yongan Wang, Xiaofeng Jiang, Xiaoqian Ohno-Machado, Lucila Tang, Haixu Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery |
title | Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4433380/ https://www.ncbi.nlm.nih.gov/pubmed/25352565 http://dx.doi.org/10.1136/amiajnl-2014-003043 |