Checking Questionable Entry of Personally Identifiable Information Encrypted by One-Way Hash Transformation
BACKGROUND: As one of the several effective solutions for personal privacy protection, a global unique identifier (GUID) is linked with hash codes that are generated from combinations of personally identifiable information (PII) by a one-way hash algorithm. On the GUID server, no PII is permitted to...
Main Authors: | Chen, Xianlai, Fann, Yang C, McAuliffe, Matthew, Vismer, David, Yang, Rong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5336604/ https://www.ncbi.nlm.nih.gov/pubmed/28213343 http://dx.doi.org/10.2196/medinform.5054 |
_version_ | 1782512221190881280 |
---|---|
author | Chen, Xianlai Fann, Yang C McAuliffe, Matthew Vismer, David Yang, Rong |
author_facet | Chen, Xianlai Fann, Yang C McAuliffe, Matthew Vismer, David Yang, Rong |
author_sort | Chen, Xianlai |
collection | PubMed |
description | BACKGROUND: As one of several effective solutions for personal privacy protection, a global unique identifier (GUID) is linked with hash codes that are generated from combinations of personally identifiable information (PII) by a one-way hash algorithm. On the GUID server, no PII is permitted to be stored, and only GUID and hash codes are allowed. The quality of PII entry is critical to the GUID system. OBJECTIVE: The goal of our study was to explore a method of checking questionable entry of PII in this context without using or sending any portion of PII while registering a subject. METHODS: Following the principles of the GUID system, all possible combination patterns of PII fields were analyzed and used to generate hash codes, which were stored on the GUID server. Based on the matching rules of the GUID system, an error-checking algorithm was developed using set theory to check PII entry errors. We selected 200,000 simulated individuals with randomly planted errors to evaluate the proposed algorithm. These errors were placed in the required PII fields or optional PII fields. The performance of the proposed algorithm was also tested in the registering system of study subjects. RESULTS: Of the 127,700 error-planted subjects, 114,464 (89.64%) could still be identified as the previously registered subject, and the remaining 13,236 (10.36%) were treated as new subjects. As expected, 100% of nonidentified subjects had errors within the required PII fields. The probability that a subject is identified depends on the number and type of incorrect PII fields. For all identified subjects, the errors could be found by the proposed algorithm. The scope of questionable PII fields also depends on the number and type of incorrect PII fields. In the best case, the algorithm pinpoints the exact incorrect PII fields; in the worst case, it narrows the questionable scope only to a set of 13 PII fields. In the application, the proposed algorithm can give a hint of questionable PII entry and perform as an effective tool. CONCLUSIONS: The GUID system has high error tolerance and may correctly identify and associate a subject even with a few PII field errors. Correct data entry, especially in the required PII fields, is critical to avoiding false splits. In the context of one-way hash transformation, the questionable input of PII may be identified by applying set theory operators based on the hash codes. The number and type of incorrect PII fields play an important role in identifying a subject and locating questionable PII fields. |
format | Online Article Text |
id | pubmed-5336604 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-5336604 2017-03-20 Checking Questionable Entry of Personally Identifiable Information Encrypted by One-Way Hash Transformation Chen, Xianlai Fann, Yang C McAuliffe, Matthew Vismer, David Yang, Rong JMIR Med Inform Original Paper JMIR Publications 2017-02-17 /pmc/articles/PMC5336604/ /pubmed/28213343 http://dx.doi.org/10.2196/medinform.5054 Text en ©Xianlai Chen, Yang C Fann, Matthew McAuliffe, David Vismer, Rong Yang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 17.02.2017. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included. |
title | Checking Questionable Entry of Personally Identifiable Information Encrypted by One-Way Hash Transformation |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5336604/ https://www.ncbi.nlm.nih.gov/pubmed/28213343 http://dx.doi.org/10.2196/medinform.5054 |
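
The description above says that hash codes are generated from combinations of PII fields and that questionable entries are located by applying set operators to those hash codes. The following is a minimal sketch of that idea, not the authors' implementation: the four field names, the use of pairwise combinations, the normalization step, and SHA-256 are all assumptions chosen for illustration, and the real GUID system's combination patterns and matching rules are not given in this record.

```python
import hashlib
from itertools import combinations

# Hypothetical PII fields; the real GUID system uses its own field list.
FIELDS = ("first_name", "last_name", "birth_date", "birth_city")


def hash_combo(record, combo):
    """One-way hash of one combination of PII fields (SHA-256 is an assumption)."""
    joined = "|".join(f"{name}={record[name].strip().lower()}" for name in combo)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def hash_codes(record, size=2):
    """Map every field combination of the given size to its hash code."""
    return {combo: hash_combo(record, combo) for combo in combinations(FIELDS, size)}


def questionable_fields(stored_hashes, entry, size=2):
    """Return the fields suspected of entry errors, or None if nothing matches.

    Only hash codes are compared; the stored side never holds PII.
    """
    entry_codes = hash_codes(entry, size)
    matched = {combo for combo, code in entry_codes.items() if code in stored_hashes}
    if not matched:
        return None  # no combination matches: treated here as a new subject
    mismatched = set(entry_codes) - matched
    trusted = set().union(*matched)        # fields appearing in a matching combination
    return set().union(*mismatched) - trusted


# Usage: register one subject, then re-enter the same subject with a typo.
registered = {"first_name": "Ana", "last_name": "Lopez",
              "birth_date": "1990-01-02", "birth_city": "Lima"}
server_side = set(hash_codes(registered).values())  # only hashes are stored

retyped = dict(registered, birth_city="Lina")
print(questionable_fields(server_side, retyped))
```

In this sketch a field is flagged only when every combination containing it fails to match, which is why the suspect set grows as more fields are wrong and why an entry with too many errors cannot be linked to the stored subject at all.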