Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing

BACKGROUND: Data sharing accelerates scientific progress but sharing individual-level data while preserving patient privacy presents a barrier. METHODS AND RESULTS: Using pairs of deep neural networks, we generated simulated, synthetic participants that closely resemble participants of the SPRINT tr...

Bibliographic Details
Main authors: Beaulieu-Jones, Brett K., Wu, Zhiwei Steven, Williams, Chris, Lee, Ran, Bhavnani, Sanjeev P., Byrd, James Brian, Greene, Casey S.
Format: Online Article Text
Language: English
Published: Lippincott Williams & Wilkins 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7041894/
https://www.ncbi.nlm.nih.gov/pubmed/31284738
http://dx.doi.org/10.1161/CIRCOUTCOMES.118.005122
_version_ 1783501225577152512
author Beaulieu-Jones, Brett K.
Wu, Zhiwei Steven
Williams, Chris
Lee, Ran
Bhavnani, Sanjeev P.
Byrd, James Brian
Greene, Casey S.
author_facet Beaulieu-Jones, Brett K.
Wu, Zhiwei Steven
Williams, Chris
Lee, Ran
Bhavnani, Sanjeev P.
Byrd, James Brian
Greene, Casey S.
author_sort Beaulieu-Jones, Brett K.
collection PubMed
description BACKGROUND: Data sharing accelerates scientific progress but sharing individual-level data while preserving patient privacy presents a barrier. METHODS AND RESULTS: Using pairs of deep neural networks, we generated simulated, synthetic participants that closely resemble participants of the SPRINT trial (Systolic Blood Pressure Trial). We showed that such paired networks can be trained with differential privacy, a formal privacy framework that limits the likelihood that queries of the synthetic participants’ data could identify a real participant in the trial. Machine learning predictors built on the synthetic population generalize to the original data set. This finding suggests that the synthetic data can be shared with others, enabling them to perform hypothesis-generating analyses as though they had the original trial data. CONCLUSIONS: Deep neural networks that generate synthetic participants facilitate secondary analyses and reproducible investigation of clinical data sets by enhancing data sharing while preserving participant privacy.
format Online
Article
Text
id pubmed-7041894
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Lippincott Williams & Wilkins
record_format MEDLINE/PubMed
spelling pubmed-70418942020-07-09 Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing Beaulieu-Jones, Brett K. Wu, Zhiwei Steven Williams, Chris Lee, Ran Bhavnani, Sanjeev P. Byrd, James Brian Greene, Casey S. Circ Cardiovasc Qual Outcomes Methods Paper BACKGROUND: Data sharing accelerates scientific progress but sharing individual-level data while preserving patient privacy presents a barrier. METHODS AND RESULTS: Using pairs of deep neural networks, we generated simulated, synthetic participants that closely resemble participants of the SPRINT trial (Systolic Blood Pressure Trial). We showed that such paired networks can be trained with differential privacy, a formal privacy framework that limits the likelihood that queries of the synthetic participants’ data could identify a real participant in the trial. Machine learning predictors built on the synthetic population generalize to the original data set. This finding suggests that the synthetic data can be shared with others, enabling them to perform hypothesis-generating analyses as though they had the original trial data. CONCLUSIONS: Deep neural networks that generate synthetic participants facilitate secondary analyses and reproducible investigation of clinical data sets by enhancing data sharing while preserving participant privacy. Lippincott Williams & Wilkins 2019-07 2019-07-09 /pmc/articles/PMC7041894/ /pubmed/31284738 http://dx.doi.org/10.1161/CIRCOUTCOMES.118.005122 Text en © 2019 The Authors. Circulation: Cardiovascular Quality and Outcomes is published on behalf of the American Heart Association, Inc., by Wolters Kluwer Health, Inc. This is an open access article under the terms of the Creative Commons Attribution (https://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution, and reproduction in any medium, provided that the original work is properly cited.
spellingShingle Methods Paper
Beaulieu-Jones, Brett K.
Wu, Zhiwei Steven
Williams, Chris
Lee, Ran
Bhavnani, Sanjeev P.
Byrd, James Brian
Greene, Casey S.
Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing
title Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing
title_full Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing
title_fullStr Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing
title_full_unstemmed Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing
title_short Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing
title_sort privacy-preserving generative deep neural networks support clinical data sharing
topic Methods Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7041894/
https://www.ncbi.nlm.nih.gov/pubmed/31284738
http://dx.doi.org/10.1161/CIRCOUTCOMES.118.005122
work_keys_str_mv AT beaulieujonesbrettk privacypreservinggenerativedeepneuralnetworkssupportclinicaldatasharing
AT wuzhiweisteven privacypreservinggenerativedeepneuralnetworkssupportclinicaldatasharing
AT williamschris privacypreservinggenerativedeepneuralnetworkssupportclinicaldatasharing
AT leeran privacypreservinggenerativedeepneuralnetworkssupportclinicaldatasharing
AT bhavnanisanjeevp privacypreservinggenerativedeepneuralnetworkssupportclinicaldatasharing
AT byrdjamesbrian privacypreservinggenerativedeepneuralnetworkssupportclinicaldatasharing
AT greenecaseys privacypreservinggenerativedeepneuralnetworkssupportclinicaldatasharing