Checking Questionable Entry of Personally Identifiable Information Encrypted by One-Way Hash Transformation

BACKGROUND: As one of the several effective solutions for personal privacy protection, a global unique identifier (GUID) is linked with hash codes that are generated from combinations of personally identifiable information (PII) by a one-way hash algorithm. On the GUID server, no PII is permitted to...

Bibliographic Details
Main authors: Chen, Xianlai, Fann, Yang C, McAuliffe, Matthew, Vismer, David, Yang, Rong
Format: Online Article Text
Language: English
Published: JMIR Publications 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5336604/
https://www.ncbi.nlm.nih.gov/pubmed/28213343
http://dx.doi.org/10.2196/medinform.5054
author Chen, Xianlai
Fann, Yang C
McAuliffe, Matthew
Vismer, David
Yang, Rong
collection PubMed
description BACKGROUND: As one of several effective solutions for protecting personal privacy, a global unique identifier (GUID) is linked with hash codes generated from combinations of personally identifiable information (PII) by a one-way hash algorithm. No PII may be stored on the GUID server; only GUIDs and hash codes are allowed. The quality of PII entry is therefore critical to the GUID system.
OBJECTIVE: The goal of our study was to explore a method of checking questionable PII entries in this context without using or sending any portion of the PII while registering a subject.
METHODS: Following the principle of the GUID system, all possible combination patterns of PII fields were analyzed and used to generate hash codes, which were stored on the GUID server. Based on the matching rules of the GUID system, an error-checking algorithm was developed using set theory to check for PII entry errors. We selected 200,000 simulated individuals with randomly planted errors to evaluate the proposed algorithm. The errors were placed in either required or optional PII fields. The performance of the proposed algorithm was also tested in the registration system for study subjects.
RESULTS: Of the 127,700 error-planted subjects, 114,464 (89.64%) could still be identified as the previously registered subject, and the remaining 13,236 (10.36%) were classified as new subjects. As expected, 100% of nonidentified subjects had errors within the required PII fields. The probability that a subject is identified depends on the number and type of incorrect PII fields. For all identified subjects, the errors could be found by the proposed algorithm. The scope of questionable PII fields also depends on the number and type of incorrect PII fields. In the best case, the algorithm pinpoints the exact incorrect PII fields; in the worst case, it only narrows the questionable scope to a set of 13 PII fields. In application, the proposed algorithm can flag questionable PII entries and serve as an effective tool.
CONCLUSIONS: The GUID system has high error tolerance and may correctly identify and associate a subject even with a few PII field errors. Correct data entry, especially in required PII fields, is critical to avoiding false splits. In the context of one-way hash transformation, questionable PII input may be identified by applying set theory operators to the hash codes. The number and type of incorrect PII fields play an important role in identifying a subject and locating questionable PII fields.
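The idea in the abstract (hash codes built from combinations of PII fields, then set operations to narrow down questionable fields) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the five field names, the three-field combination size, the `|` separator, and the use of SHA-256 are all assumptions made for illustration, and the real GUID system's combination patterns and matching rules are not given in this record. The sketch treats every field that appears in at least one matching hash code as trusted and reports the remaining fields as questionable.

```python
import hashlib
from itertools import combinations

# Hypothetical PII fields; the real GUID system defines its own set.
FIELDS = ("first_name", "last_name", "birth_date", "birth_city", "sex")


def hash_codes(record, size=3):
    """Return {frozenset(field names): SHA-256 hex digest} for every
    combination of `size` fields (combination size is an assumption)."""
    codes = {}
    for combo in combinations(FIELDS, size):
        joined = "|".join(record[f].strip().upper() for f in combo)
        codes[frozenset(combo)] = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    return codes


def questionable_fields(stored, entered):
    """Compare two sets of hash codes without touching any PII.

    Returns None when no hash code matches (the entry cannot be linked
    to the stored subject); otherwise returns the fields that appear in
    no matching combination.
    """
    matched = [combo for combo, code in entered.items() if stored.get(combo) == code]
    if not matched:
        return None
    trusted = set().union(*matched)  # fields confirmed by at least one match
    return set(FIELDS) - trusted     # set difference leaves the suspects


# Usage: only hash codes are compared, never the raw values.
original = {
    "first_name": "Ana",
    "last_name": "Lopez",
    "birth_date": "1980-02-17",
    "birth_city": "Springfield",
    "sex": "F",
}
retyped = dict(original, birth_city="Springfeld")  # one planted typo
print(sorted(questionable_fields(hash_codes(original), hash_codes(retyped))))
```

Under these assumptions a single wrong field is pinpointed exactly, while more errors either widen the questionable scope or prevent any match at all, which mirrors the abstract's observation that the outcome depends on the number and type of incorrect fields.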
format Online
Article
Text
id pubmed-5336604
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-5336604 2017-03-20 Checking Questionable Entry of Personally Identifiable Information Encrypted by One-Way Hash Transformation Chen, Xianlai Fann, Yang C McAuliffe, Matthew Vismer, David Yang, Rong JMIR Med Inform Original Paper JMIR Publications 2017-02-17 /pmc/articles/PMC5336604/ /pubmed/28213343 http://dx.doi.org/10.2196/medinform.5054 Text en ©Xianlai Chen, Yang C Fann, Matthew McAuliffe, David Vismer, Rong Yang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 17.02.2017. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Checking Questionable Entry of Personally Identifiable Information Encrypted by One-Way Hash Transformation
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5336604/
https://www.ncbi.nlm.nih.gov/pubmed/28213343
http://dx.doi.org/10.2196/medinform.5054