Threshold protocol for the exchange of confidential medical data

BACKGROUND: Medical researchers often need to share clinical data without violating patient confidentiality. Threshold cryptographic protocols divide messages into multiple pieces, no single piece containing information that can reconstruct the original message. The author describes and implements a...

Bibliographic Details
Main Author: Berman, Jules J
Format: Text
Language: English
Published: BioMed Central 2002
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC165892/
https://www.ncbi.nlm.nih.gov/pubmed/12425722
http://dx.doi.org/10.1186/1471-2288-2-12
_version_ 1782120845894746112
author Berman, Jules J
author_facet Berman, Jules J
author_sort Berman, Jules J
collection PubMed
description BACKGROUND: Medical researchers often need to share clinical data without violating patient confidentiality. Threshold cryptographic protocols divide messages into multiple pieces, no single piece containing information that can reconstruct the original message. The author describes and implements a novel threshold protocol that can be used to search, annotate or transform confidential data without breaching patient confidentiality. METHODS: The basic threshold protocol is: 1) Text is divided into short phrases; 2) Each phrase is converted by a one-way hash algorithm into a seemingly-random set of characters; 3) Threshold Piece 1 is composed of the list of all phrases, with each phrase followed by its one-way hash; 4) Threshold Piece 2 is composed of the text with all phrases replaced by their one-way hash values, and with high-frequency words preserved. Neither Piece 1 nor Piece 2 contains information linking patients to their records. The original text can be re-constructed from Piece 1 and Piece 2. RESULTS: The threshold algorithm produces two files (threshold pieces). In typical usage, Piece 2 is held by the data owner, and Piece 1 is freely distributed. Piece 1 can be annotated and returned to the owner of the original data to enhance the complete data set. Collections of Piece 1 files can be merged and distributed without identifying patient records. Variations of the threshold protocol are described. The author's Perl implementation is freely available. CONCLUSIONS: Threshold files are safe in the sense that they are de-identified and can be used for research purposes. The threshold protocol is particularly useful when the receiver of the threshold file needs to obtain certain concepts or data-types found in the original data, but does not need to fully understand the original data set.
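The four steps listed under METHODS can be illustrated with a minimal Python sketch. This is not the author's Perl implementation. The stop-word list, the use of a truncated SHA-256 digest, the rule that a phrase is a run of words between stop words, and the function names split_threshold, one_way_hash and reconstruct are all assumptions made for illustration. The sketch lowercases the text and drops punctuation, so it reconstructs the normalized text rather than the exact original.

```python
import hashlib
import re

# Assumption: a tiny illustrative list of "high-frequency words".
# The abstract does not say which words are preserved.
STOP_WORDS = {"a", "an", "and", "the", "of", "with", "was", "is", "in", "to", "for", "no"}


def one_way_hash(phrase, length=12):
    """Hypothetical helper: truncated SHA-256 hex digest of a phrase."""
    return hashlib.sha256(phrase.encode("utf-8")).hexdigest()[:length]


def split_threshold(text):
    """Split text into two threshold pieces.

    piece1: alphabetically sorted list of (phrase, hash) pairs, so the
            list order does not reveal the order of phrases in the text.
    piece2: the text with each phrase replaced by its hash and the
            stop words left in place.
    """
    pairs = {}
    piece2_tokens = []
    phrase_words = []

    def flush():
        # Close the current phrase, hash it, and record it in both pieces.
        if phrase_words:
            phrase = " ".join(phrase_words)
            digest = one_way_hash(phrase)
            pairs[phrase] = digest
            piece2_tokens.append(digest)
            phrase_words.clear()

    for word in re.findall(r"[a-z0-9]+", text.lower()):
        if word in STOP_WORDS:
            flush()
            piece2_tokens.append(word)
        else:
            phrase_words.append(word)
    flush()

    return sorted(pairs.items()), " ".join(piece2_tokens)


def reconstruct(piece1, piece2):
    """Rebuild the normalized text; this needs both pieces."""
    lookup = {digest: phrase for phrase, digest in piece1}
    return " ".join(lookup.get(token, token) for token in piece2.split())


if __name__ == "__main__":
    report = "The patient was diagnosed with adenocarcinoma of the colon."
    piece1, piece2 = split_threshold(report)
    print(piece1)
    print(piece2)
    print(reconstruct(piece1, piece2))
```

In this sketch Piece 1 carries the phrases without their context and Piece 2 carries the sentence structure without the phrases; only a holder of both can map the hash values back into the text.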
format Text
id pubmed-165892
institution National Center for Biotechnology Information
language English
publishDate 2002
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-165892 2003-07-24 Threshold protocol for the exchange of confidential medical data Berman, Jules J BMC Med Res Methodol Research Article
BioMed Central 2002-11-11 /pmc/articles/PMC165892/ /pubmed/12425722 http://dx.doi.org/10.1186/1471-2288-2-12 Text en This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
spellingShingle Research Article
Berman, Jules J
Threshold protocol for the exchange of confidential medical data
title Threshold protocol for the exchange of confidential medical data
title_sort threshold protocol for the exchange of confidential medical data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC165892/
https://www.ncbi.nlm.nih.gov/pubmed/12425722
http://dx.doi.org/10.1186/1471-2288-2-12
work_keys_str_mv AT bermanjulesj thresholdprotocolfortheexchangeofconfidentialmedicaldata