Privacy preserving linkage using multiple match-keys

Bibliographic Details
Main authors: Randall, SM, Brown, AP, Ferrante, AM, Boyd, JH
Format: Online Article Text
Language: English
Published: Swansea University 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7482515/
https://www.ncbi.nlm.nih.gov/pubmed/32935028
http://dx.doi.org/10.23889/ijpds.v4i1.1094
Description
Summary: INTRODUCTION: Available and practical methods for privacy preserving linkage have shortcomings: methods utilising anonymous linkage codes provide limited accuracy, while methods based on Bloom filters have proven vulnerable to frequency-based attacks. OBJECTIVES: In this paper, we present and evaluate a novel protocol that aims to combine the accuracy of the Bloom filter method with the privacy achievable through the anonymous linkage code methodology. METHODS: The protocol involves creating multiple match-keys for each record, with the composition of each match-key depending on attributes of the underlying datasets being compared. The protocol was evaluated through de-duplication of four administrative datasets and two synthetic datasets; the true match status (which records belonged to the same individual) was known for each dataset. The results were compared against those achieved with unencoded linkage and with other privacy preserving techniques on the same datasets. RESULTS: The multiple match-key protocol presented here achieved high linkage quality across all datasets, performing better than record-level Bloom filters and the SLK, but worse than field-level Bloom filters. CONCLUSION: The presented method provides high linkage quality while avoiding the frequency-based attacks that have been demonstrated against the Bloom filter approach. The method appears promising for real-world use.
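
The idea in METHODS of creating several match-keys per record and de-duplicating on them can be illustrated with a minimal sketch. This is not the authors' implementation: the field names, the three field combinations in `MATCH_KEY_FIELDS`, the use of a keyed HMAC-SHA256 hash, and the rule that any single shared match-key links two records are all assumptions made for illustration. The summary states only that the composition of each match-key depends on attributes of the datasets being compared.

```python
import hashlib
import hmac

# Hypothetical match-key definitions (an assumption, not taken from the paper):
# each tuple names the fields that are combined into one match-key.
MATCH_KEY_FIELDS = [
    ("first_name", "last_name", "dob"),
    ("first_name", "dob", "postcode"),
    ("last_name", "dob", "postcode"),
]


def normalise(value):
    """Lower-case a value and strip all whitespace; None becomes ''."""
    return "".join(str(value or "").lower().split())


def match_keys(record, secret):
    """Return the set of hashed match-keys for one record.

    A match-key is skipped when any of its fields is missing, so an
    incomplete record simply produces fewer keys.
    """
    keys = set()
    for index, fields in enumerate(MATCH_KEY_FIELDS):
        values = [normalise(record.get(field)) for field in fields]
        if not all(values):
            continue
        # Prefix with the key index so different field combinations
        # can never collide on the same hash.
        message = "|".join([str(index)] + values).encode("utf-8")
        keys.add(hmac.new(secret, message, hashlib.sha256).hexdigest())
    return keys


def find_duplicate_pairs(records, secret):
    """De-duplicate: link two records when they share at least one match-key."""
    seen = {}      # hashed match-key -> ids of records that produced it
    pairs = set()  # (earlier_id, later_id) in insertion order
    for rec_id, record in records.items():
        for key in match_keys(record, secret):
            for other_id in seen.get(key, []):
                pairs.add((other_id, rec_id))
            seen.setdefault(key, []).append(rec_id)
    return pairs


records = {
    "a": {"first_name": "John", "last_name": "Smith", "dob": "1980-01-01", "postcode": "6000"},
    "b": {"first_name": "Jon", "last_name": "Smith", "dob": "1980-01-01", "postcode": "6000"},
    "c": {"first_name": "Mary", "last_name": "Jones", "dob": "1975-06-30", "postcode": "6100"},
}
duplicate_pairs = find_duplicate_pairs(records, secret=b"shared-secret")
```

In this toy data, records "a" and "b" differ in the first name, so they disagree on the first two match-keys but still agree on the last name, date of birth and postcode key. Only exact hash equality is compared, which is the property that distinguishes this style of linkage from similarity scoring on Bloom filters.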