
Privacy-preserving data sharing infrastructures for medical research: systematization and comparison

BACKGROUND: Data sharing is considered a crucial part of modern medical research. Unfortunately, despite its advantages, it often faces obstacles, especially data privacy challenges. As a result, various approaches and infrastructures have been developed that aim to ensure that patients and research...


Bibliographic Details
Main Authors: Wirth, Felix Nikolaus, Meurers, Thierry, Johns, Marco, Prasser, Fabian
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8359765/
https://www.ncbi.nlm.nih.gov/pubmed/34384406
http://dx.doi.org/10.1186/s12911-021-01602-x
author Wirth, Felix Nikolaus
Meurers, Thierry
Johns, Marco
Prasser, Fabian
collection PubMed
description BACKGROUND: Data sharing is considered a crucial part of modern medical research. Unfortunately, despite its advantages, it often faces obstacles, especially data privacy challenges. As a result, various approaches and infrastructures have been developed that aim to ensure that patients and research participants remain anonymous when data is shared. However, privacy protection typically comes at a cost, e.g. restrictions regarding the types of analyses that can be performed on shared data. What is lacking is a systematization making the trade-offs taken by different approaches transparent. The aim of the work described in this paper was to develop a systematization for the degree of privacy protection provided and the trade-offs taken by different data sharing methods. Based on this contribution, we categorized popular data sharing approaches and identified research gaps by analyzing combinations of promising properties and features that are not yet supported by existing approaches. METHODS: The systematization consists of different axes. Three axes relate to privacy protection aspects and were adopted from the popular Five Safes Framework: (1) safe data, addressing privacy at the input level, (2) safe settings, addressing privacy during shared processing, and (3) safe outputs, addressing privacy protection of analysis results. Three additional axes address the usefulness of approaches: (4) support for de-duplication, to enable the reconciliation of data belonging to the same individuals, (5) flexibility, to be able to adapt to different data analysis requirements, and (6) scalability, to maintain performance with increasing complexity of shared data or common analysis processes. 
RESULTS: Using the systematization, we identified three different categories of approaches: distributed data analyses, which exchange anonymous aggregated data, secure multi-party computation protocols, which exchange encrypted data, and data enclaves, which store pooled individual-level data in secure environments for access for analysis purposes. We identified important research gaps, including a lack of approaches enabling the de-duplication of horizontally distributed data or providing a high degree of flexibility. CONCLUSIONS: There are fundamental differences between different data sharing approaches and several gaps in their functionality that may be interesting to investigate in future work. Our systematization can make the properties of privacy-preserving data sharing infrastructures more transparent and support decision makers and regulatory authorities with a better understanding of the trade-offs taken.
format Online
Article
Text
id pubmed-8359765
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8359765 2021-08-13 Privacy-preserving data sharing infrastructures for medical research: systematization and comparison Wirth, Felix Nikolaus Meurers, Thierry Johns, Marco Prasser, Fabian BMC Med Inform Decis Mak Research BioMed Central 2021-08-12 /pmc/articles/PMC8359765/ /pubmed/34384406 http://dx.doi.org/10.1186/s12911-021-01602-x Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Privacy-preserving data sharing infrastructures for medical research: systematization and comparison
topic Research