
Privacy-preserving record linkage in large databases using secure multiparty computation

BACKGROUND: Practical applications for data analysis may require combining multiple databases belonging to different owners, such as health centers. The analysis should be performed without violating the privacy of either the centers themselves or the patients whose records these centers store. To av...


Bibliographic Details
Main Authors: Laud, Peeter, Pankova, Alisa
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6180364/
https://www.ncbi.nlm.nih.gov/pubmed/30309353
http://dx.doi.org/10.1186/s12920-018-0400-8
_version_ 1783362185050718208
author Laud, Peeter
Pankova, Alisa
author_facet Laud, Peeter
Pankova, Alisa
author_sort Laud, Peeter
collection PubMed
description BACKGROUND: Practical applications for data analysis may require combining multiple databases belonging to different owners, such as health centers. The analysis should be performed without violating the privacy of either the centers themselves or the patients whose records these centers store. To avoid biased analysis results, it may be important to remove duplicate records among the centers, so that each patient’s data is taken into account only once. This task is very closely related to privacy-preserving record linkage. METHODS: This paper presents a solution to privacy-preserving deduplication among records of several databases using secure multiparty computation. It is built upon one of the fastest practical secure multiparty computation platforms, called Sharemind. RESULTS: Tests on ca. 10 million records from simulated databases, with 1000 health centers of 10,000 records each, show that the computation is feasible in practice. The expected running time of the experiment is ca. 30 min for computing servers connected over a 100 Mbit/s WAN; the expected error of the results is 2^(−40), and no errors were detected for the particular test set that we used for our benchmarks. CONCLUSIONS: The solution is ready for practical use. It has well-defined security properties, implied by the properties of the Sharemind platform. The solution assumes that exact matching of records is required; a possible direction for future research is extending it to approximate matching.
format Online
Article
Text
id pubmed-6180364
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6180364 2018-10-18 Privacy-preserving record linkage in large databases using secure multiparty computation Laud, Peeter Pankova, Alisa BMC Med Genomics Technical Advance BACKGROUND: Practical applications for data analysis may require combining multiple databases belonging to different owners, such as health centers. The analysis should be performed without violating the privacy of either the centers themselves or the patients whose records these centers store. To avoid biased analysis results, it may be important to remove duplicate records among the centers, so that each patient’s data is taken into account only once. This task is very closely related to privacy-preserving record linkage. METHODS: This paper presents a solution to privacy-preserving deduplication among records of several databases using secure multiparty computation. It is built upon one of the fastest practical secure multiparty computation platforms, called Sharemind. RESULTS: Tests on ca. 10 million records from simulated databases, with 1000 health centers of 10,000 records each, show that the computation is feasible in practice. The expected running time of the experiment is ca. 30 min for computing servers connected over a 100 Mbit/s WAN; the expected error of the results is 2^(−40), and no errors were detected for the particular test set that we used for our benchmarks. CONCLUSIONS: The solution is ready for practical use. It has well-defined security properties, implied by the properties of the Sharemind platform. The solution assumes that exact matching of records is required; a possible direction for future research is extending it to approximate matching.
BioMed Central 2018-10-11 /pmc/articles/PMC6180364/ /pubmed/30309353 http://dx.doi.org/10.1186/s12920-018-0400-8 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Technical Advance
Laud, Peeter
Pankova, Alisa
Privacy-preserving record linkage in large databases using secure multiparty computation
title Privacy-preserving record linkage in large databases using secure multiparty computation
title_full Privacy-preserving record linkage in large databases using secure multiparty computation
title_fullStr Privacy-preserving record linkage in large databases using secure multiparty computation
title_full_unstemmed Privacy-preserving record linkage in large databases using secure multiparty computation
title_short Privacy-preserving record linkage in large databases using secure multiparty computation
title_sort privacy-preserving record linkage in large databases using secure multiparty computation
topic Technical Advance
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6180364/
https://www.ncbi.nlm.nih.gov/pubmed/30309353
http://dx.doi.org/10.1186/s12920-018-0400-8
work_keys_str_mv AT laudpeeter privacypreservingrecordlinkageinlargedatabasesusingsecuremultipartycomputation
AT pankovaalisa privacypreservingrecordlinkageinlargedatabasesusingsecuremultipartycomputation