
A Privacy-Preserving Distributed Analytics Platform for Health Care Data

Background: In recent years, data-driven medicine has gained increasing importance in terms of diagnosis, treatment, and research due to the exponential growth of health care data. However, data protection regulations prohibit data centralisation for analysis purposes because of potential privacy ri...


Bibliographic Details
Main Authors: Welten, Sascha, Mou, Yongli, Neumann, Laurenz, Jaberansary, Mehrshad, Yediel Ucer, Yeliz, Kirsten, Toralf, Decker, Stefan, Beyan, Oya
Format: Online Article Text
Language: English
Published: Georg Thieme Verlag KG 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9246511/
https://www.ncbi.nlm.nih.gov/pubmed/35038764
http://dx.doi.org/10.1055/s-0041-1740564
author Welten, Sascha
Mou, Yongli
Neumann, Laurenz
Jaberansary, Mehrshad
Yediel Ucer, Yeliz
Kirsten, Toralf
Decker, Stefan
Beyan, Oya
collection PubMed
description Background: In recent years, data-driven medicine has gained increasing importance in terms of diagnosis, treatment, and research due to the exponential growth of health care data. However, data protection regulations prohibit data centralisation for analysis purposes because of potential privacy risks like the accidental disclosure of data to third parties. Therefore, alternative data usage policies, which comply with present privacy guidelines, are of particular interest. Objective: We aim to enable analyses on sensitive patient data by simultaneously complying with local data protection regulations using an approach called the Personal Health Train (PHT), which is a paradigm that utilises distributed analytics (DA) methods. The main principle of the PHT is that the analytical task is brought to the data provider and the data instances remain in their original location. Methods: In this work, we present our implementation of the PHT paradigm, which preserves the sovereignty and autonomy of the data providers and operates with a limited number of communication channels. We further conduct a DA use case on data stored in three different and distributed data providers. Results: We show that our infrastructure enables the training of data models based on distributed data sources. Conclusion: Our work presents the capabilities of DA infrastructures in the health care sector, which lower the regulatory obstacles of sharing patient data. We further demonstrate its ability to fuel medical science by making distributed data sets available for scientists or health care practitioners.
format Online
Article
Text
id pubmed-9246511
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Georg Thieme Verlag KG
record_format MEDLINE/PubMed
spelling pubmed-9246511 2022-07-01 A Privacy-Preserving Distributed Analytics Platform for Health Care Data Welten, Sascha Mou, Yongli Neumann, Laurenz Jaberansary, Mehrshad Yediel Ucer, Yeliz Kirsten, Toralf Decker, Stefan Beyan, Oya Methods Inf Med Georg Thieme Verlag KG 2022-01-17 /pmc/articles/PMC9246511/ /pubmed/35038764 http://dx.doi.org/10.1055/s-0041-1740564 Text en The Author(s).
This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/) This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License, which permits unrestricted reproduction and distribution, for non-commercial purposes only; and use and reproduction, but not distribution, of adapted material for non-commercial purposes only, provided the original work is properly cited.
title A Privacy-Preserving Distributed Analytics Platform for Health Care Data