Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation

BACKGROUND: Collaborative privacy-preserving training methods allow for the integration of locally stored private data sets into machine learning approaches while ensuring confidentiality and nondisclosure. OBJECTIVE: In this work we assess the performance of a state-of-the-art neural network approa...

Bibliographic Details
Main authors:	Festag, Sven; Spreckelsen, Cord
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2020
Subjects:	Original Paper
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7238077/ https://www.ncbi.nlm.nih.gov/pubmed/32369025 http://dx.doi.org/10.2196/14064

_version_	1783536460476973056
author	Festag, Sven Spreckelsen, Cord
author_facet	Festag, Sven Spreckelsen, Cord
author_sort	Festag, Sven
collection	PubMed
description	BACKGROUND: Collaborative privacy-preserving training methods allow for the integration of locally stored private data sets into machine learning approaches while ensuring confidentiality and nondisclosure. OBJECTIVE: In this work we assess the performance of a state-of-the-art neural network approach for the detection of protected health information in texts trained in a collaborative privacy-preserving way. METHODS: The training adopts distributed selective stochastic gradient descent (ie, it works by exchanging local learning results achieved on private data sets). Five networks were trained on separated real-world clinical data sets by using the privacy-protecting protocol. In total, the data sets contain 1304 real longitudinal patient records for 296 patients. RESULTS: These networks reached a mean F1 value of 0.955. The gold standard centralized training that is based on the union of all sets and does not take data security into consideration reaches a final value of 0.962. CONCLUSIONS: Using real-world clinical data, our study shows that detection of protected health information can be secured by collaborative privacy-preserving training. In general, the approach shows the feasibility of deep learning on distributed and confidential clinical data while ensuring data protection.
format	Online Article Text
id	pubmed-7238077
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-72380772020-06-01 Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation Festag, Sven Spreckelsen, Cord JMIR Form Res Original Paper BACKGROUND: Collaborative privacy-preserving training methods allow for the integration of locally stored private data sets into machine learning approaches while ensuring confidentiality and nondisclosure. OBJECTIVE: In this work we assess the performance of a state-of-the-art neural network approach for the detection of protected health information in texts trained in a collaborative privacy-preserving way. METHODS: The training adopts distributed selective stochastic gradient descent (ie, it works by exchanging local learning results achieved on private data sets). Five networks were trained on separated real-world clinical data sets by using the privacy-protecting protocol. In total, the data sets contain 1304 real longitudinal patient records for 296 patients. RESULTS: These networks reached a mean F1 value of 0.955. The gold standard centralized training that is based on the union of all sets and does not take data security into consideration reaches a final value of 0.962. CONCLUSIONS: Using real-world clinical data, our study shows that detection of protected health information can be secured by collaborative privacy-preserving training. In general, the approach shows the feasibility of deep learning on distributed and confidential clinical data while ensuring data protection. JMIR Publications 2020-05-05 /pmc/articles/PMC7238077/ /pubmed/32369025 http://dx.doi.org/10.2196/14064 Text en ©Sven Festag, Cord Spreckelsen. Originally published in JMIR Formative Research (http://formative.jmir.org), 05.05.2020. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on http://formative.jmir.org, as well as this copyright and license information must be included.
spellingShingle	Original Paper Festag, Sven Spreckelsen, Cord Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation
title	Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation
title_full	Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation
title_fullStr	Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation
title_full_unstemmed	Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation
title_short	Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation
title_sort	privacy-preserving deep learning for the detection of protected health information in real-world data: comparative evaluation
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7238077/ https://www.ncbi.nlm.nih.gov/pubmed/32369025 http://dx.doi.org/10.2196/14064
work_keys_str_mv	AT festagsven privacypreservingdeeplearningforthedetectionofprotectedhealthinformationinrealworlddatacomparativeevaluation AT spreckelsencord privacypreservingdeeplearningforthedetectionofprotectedhealthinformationinrealworlddatacomparativeevaluation
