A Privacy-Preserving Distributed Analytics Platform for Health Care Data
Background In recent years, data-driven medicine has gained increasing importance in terms of diagnosis, treatment, and research due to the exponential growth of health care data. However, data protection regulations prohibit data centralisation for analysis purposes because of potential privacy ri...
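Main Authors: | Welten, Sascha; Mou, Yongli; Neumann, Laurenz; Jaberansary, Mehrshad; Yediel Ucer, Yeliz; Kirsten, Toralf; Decker, Stefan; Beyan, Oya |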
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Georg Thieme Verlag KG 2022 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9246511/ https://www.ncbi.nlm.nih.gov/pubmed/35038764 http://dx.doi.org/10.1055/s-0041-1740564 |
_version_ | 1784738984876834816 |
---|---|
author | Welten, Sascha Mou, Yongli Neumann, Laurenz Jaberansary, Mehrshad Yediel Ucer, Yeliz Kirsten, Toralf Decker, Stefan Beyan, Oya |
author_facet | Welten, Sascha Mou, Yongli Neumann, Laurenz Jaberansary, Mehrshad Yediel Ucer, Yeliz Kirsten, Toralf Decker, Stefan Beyan, Oya |
author_sort | Welten, Sascha |
collection | PubMed |
description | Background: In recent years, data-driven medicine has gained increasing importance in terms of diagnosis, treatment, and research due to the exponential growth of health care data. However, data protection regulations prohibit data centralisation for analysis purposes because of potential privacy risks like the accidental disclosure of data to third parties. Therefore, alternative data usage policies, which comply with present privacy guidelines, are of particular interest. Objective: We aim to enable analyses on sensitive patient data by simultaneously complying with local data protection regulations using an approach called the Personal Health Train (PHT), which is a paradigm that utilises distributed analytics (DA) methods. The main principle of the PHT is that the analytical task is brought to the data provider and the data instances remain in their original location. Methods: In this work, we present our implementation of the PHT paradigm, which preserves the sovereignty and autonomy of the data providers and operates with a limited number of communication channels. We further conduct a DA use case on data stored in three different and distributed data providers. Results: We show that our infrastructure enables the training of data models based on distributed data sources. Conclusion: Our work presents the capabilities of DA infrastructures in the health care sector, which lower the regulatory obstacles of sharing patient data. We further demonstrate its ability to fuel medical science by making distributed data sets available for scientists or health care practitioners. |
format | Online Article Text |
id | pubmed-9246511 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Georg Thieme Verlag KG |
record_format | MEDLINE/PubMed |
spelling | pubmed-92465112022-07-01 A Privacy-Preserving Distributed Analytics Platform for Health Care Data Welten, Sascha Mou, Yongli Neumann, Laurenz Jaberansary, Mehrshad Yediel Ucer, Yeliz Kirsten, Toralf Decker, Stefan Beyan, Oya Methods Inf Med Background In recent years, data-driven medicine has gained increasing importance in terms of diagnosis, treatment, and research due to the exponential growth of health care data. However, data protection regulations prohibit data centralisation for analysis purposes because of potential privacy risks like the accidental disclosure of data to third parties. Therefore, alternative data usage policies, which comply with present privacy guidelines, are of particular interest. Objective We aim to enable analyses on sensitive patient data by simultaneously complying with local data protection regulations using an approach called the Personal Health Train (PHT), which is a paradigm that utilises distributed analytics (DA) methods. The main principle of the PHT is that the analytical task is brought to the data provider and the data instances remain in their original location. Methods In this work, we present our implementation of the PHT paradigm, which preserves the sovereignty and autonomy of the data providers and operates with a limited number of communication channels. We further conduct a DA use case on data stored in three different and distributed data providers. Results We show that our infrastructure enables the training of data models based on distributed data sources. Conclusion Our work presents the capabilities of DA infrastructures in the health care sector, which lower the regulatory obstacles of sharing patient data. We further demonstrate its ability to fuel medical science by making distributed data sets available for scientists or health care practitioners. Georg Thieme Verlag KG 2022-01-17 /pmc/articles/PMC9246511/ /pubmed/35038764 http://dx.doi.org/10.1055/s-0041-1740564 Text en The Author(s). 
This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. ( https://creativecommons.org/licenses/by-nc-nd/4.0/ ) This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License, which permits unrestricted reproduction and distribution, for non-commercial purposes only; and use and reproduction, but not distribution, of adapted material for non-commercial purposes only, provided the original work is properly cited. |
spellingShingle | Welten, Sascha Mou, Yongli Neumann, Laurenz Jaberansary, Mehrshad Yediel Ucer, Yeliz Kirsten, Toralf Decker, Stefan Beyan, Oya A Privacy-Preserving Distributed Analytics Platform for Health Care Data |
title | A Privacy-Preserving Distributed Analytics Platform for Health Care Data |
title_full | A Privacy-Preserving Distributed Analytics Platform for Health Care Data |
title_fullStr | A Privacy-Preserving Distributed Analytics Platform for Health Care Data |
title_full_unstemmed | A Privacy-Preserving Distributed Analytics Platform for Health Care Data |
title_short | A Privacy-Preserving Distributed Analytics Platform for Health Care Data |
title_sort | privacy-preserving distributed analytics platform for health care data |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9246511/ https://www.ncbi.nlm.nih.gov/pubmed/35038764 http://dx.doi.org/10.1055/s-0041-1740564 |