
A blinded evaluation of privacy preserving record linkage with Bloom filters

BACKGROUND: Privacy preserving record linkage (PPRL) methods using Bloom filters have shown promise for use in operational linkage settings. However, real-world evaluations are required to confirm their suitability in practice. METHODS: An extract of records from the Western Australian (WA) Hospital...


Bibliographic Details
Main authors: Randall, Sean; Wichmann, Helen; Brown, Adrian; Boyd, James; Eitelhuber, Tom; Merchant, Alexandra; Ferrante, Anna
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8761329/
https://www.ncbi.nlm.nih.gov/pubmed/35034615
http://dx.doi.org/10.1186/s12874-022-01510-2
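
The abstract describes encoding records to Bloom filters and then comparing the encodings instead of the clear-text values. As a rough illustration of that general idea only, the sketch below hashes the character bigrams of a single field into a Bloom filter and scores two filters with the Dice coefficient. Everything specific here is an assumption made for illustration and is not taken from the article: the function names (`bigrams`, `encode`, `dice`), the filter length of 1000 bits, the 20 hash functions, the keyed double-hashing scheme, and the shared secret.

```python
import hashlib
import hmac


def bigrams(value: str) -> set[str]:
    """Split a normalised string into padded character bigrams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode(value: str, secret: bytes, size: int = 1000,
           num_hashes: int = 20) -> frozenset[int]:
    """Return the positions of the bits set to 1 in the Bloom filter.

    Assumed scheme: two keyed hashes per bigram, combined by double
    hashing to derive num_hashes positions. Parameters are illustrative.
    """
    positions = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hmac.new(secret, data, hashlib.sha256).digest(), "big")
        h2 = int.from_bytes(hmac.new(secret, data, hashlib.sha512).digest(), "big")
        for i in range(num_hashes):
            positions.add((h1 + i * h2) % size)
    return frozenset(positions)


def dice(a: frozenset[int], b: frozenset[int]) -> float:
    """Dice coefficient of two bit-position sets, between 0 and 1."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical secret that the data custodians would share; the party
# doing the comparison sees only the encoded bit positions.
key = b"hypothetical-shared-secret"
same = dice(encode("Smith", key), encode("smith", key))
close = dice(encode("smith", key), encode("smyth", key))
far = dice(encode("smith", key), encode("jones", key))
```

Because similar strings share most of their bigrams, their filters share most of their set bits, so `close` scores well above `far` even though neither name is visible to the comparison step. A real linkage would combine several such field comparisons and apply blocking and thresholds, none of which is shown here.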
author Randall, Sean
Wichmann, Helen
Brown, Adrian
Boyd, James
Eitelhuber, Tom
Merchant, Alexandra
Ferrante, Anna
author_sort Randall, Sean
collection PubMed
description BACKGROUND: Privacy preserving record linkage (PPRL) methods using Bloom filters have shown promise for use in operational linkage settings. However, real-world evaluations are required to confirm their suitability in practice. METHODS: An extract of records from the Western Australian (WA) Hospital Morbidity Data Collection 2011–2015 and WA Death Registrations 2011–2015 were encoded to Bloom filters, and then linked using privacy-preserving methods. Results were compared to a traditional, un-encoded linkage of the same datasets using the same blocking criteria to enable direct investigation of the comparison step. The encoded linkage was carried out in a blinded setting, where there was no access to un-encoded data or a ‘truth set’. RESULTS: The PPRL method using Bloom filters provided similar linkage quality to the traditional un-encoded linkage, with 99.3% of ‘groupings’ identical between privacy preserving and clear-text linkage. CONCLUSION: The Bloom filter method appears suitable for use in situations where clear-text identifiers cannot be provided for linkage. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-022-01510-2.
format Online
Article
Text
id pubmed-8761329
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8761329 2022-01-18 A blinded evaluation of privacy preserving record linkage with Bloom filters Randall, Sean Wichmann, Helen Brown, Adrian Boyd, James Eitelhuber, Tom Merchant, Alexandra Ferrante, Anna BMC Med Res Methodol Research BACKGROUND: Privacy preserving record linkage (PPRL) methods using Bloom filters have shown promise for use in operational linkage settings. However, real-world evaluations are required to confirm their suitability in practice. METHODS: An extract of records from the Western Australian (WA) Hospital Morbidity Data Collection 2011–2015 and WA Death Registrations 2011–2015 were encoded to Bloom filters, and then linked using privacy-preserving methods. Results were compared to a traditional, un-encoded linkage of the same datasets using the same blocking criteria to enable direct investigation of the comparison step. The encoded linkage was carried out in a blinded setting, where there was no access to un-encoded data or a ‘truth set’. RESULTS: The PPRL method using Bloom filters provided similar linkage quality to the traditional un-encoded linkage, with 99.3% of ‘groupings’ identical between privacy preserving and clear-text linkage. CONCLUSION: The Bloom filter method appears suitable for use in situations where clear-text identifiers cannot be provided for linkage. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-022-01510-2. BioMed Central 2022-01-16 /pmc/articles/PMC8761329/ /pubmed/35034615 http://dx.doi.org/10.1186/s12874-022-01510-2 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A blinded evaluation of privacy preserving record linkage with Bloom filters
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8761329/
https://www.ncbi.nlm.nih.gov/pubmed/35034615
http://dx.doi.org/10.1186/s12874-022-01510-2