
API Driven On-Demand Participant ID Pseudonymization in Heterogeneous Multi-Study Research

OBJECTIVES: To facilitate clinical and translational research, imaging and non-imaging clinical data from multiple disparate systems must be aggregated for analysis. Study participant records from various sources are linked together and to patient records when possible to address research questions...

Bibliographic Details
Main Authors: Syed, Shorabuddin, Syed, Mahanazuddin, Syeda, Hafsa Bareen, Garza, Maryam, Bennett, William, Bona, Jonathan, Begum, Salma, Baghal, Ahmad, Zozus, Meredith, Prior, Fred
Format: Online Article Text
Language: English
Published: Korean Society of Medical Informatics 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7921568/
https://www.ncbi.nlm.nih.gov/pubmed/33611875
http://dx.doi.org/10.4258/hir.2021.27.1.39
author Syed, Shorabuddin
Syed, Mahanazuddin
Syeda, Hafsa Bareen
Garza, Maryam
Bennett, William
Bona, Jonathan
Begum, Salma
Baghal, Ahmad
Zozus, Meredith
Prior, Fred
author_sort Syed, Shorabuddin
collection PubMed
description OBJECTIVES: To facilitate clinical and translational research, imaging and non-imaging clinical data from multiple disparate systems must be aggregated for analysis. Study participant records from various sources are linked together and to patient records when possible to address research questions while ensuring patient privacy. This paper presents a novel tool that pseudonymizes participant identifiers (PIDs) using a researcher-driven automated process that takes advantage of an application programming interface (API) and the Perl Open-Source Digital Imaging and Communications in Medicine Archive (POSDA) to further de-identify PIDs. The tool, on-demand cohort and API participant identifier pseudonymization (O-CAPP), employs a pseudonymization method based on the type of incoming research data. METHODS: For images, pseudonymization of PIDs is done using API calls that receive PIDs present in Digital Imaging and Communications in Medicine (DICOM) headers and return the pseudonymized identifiers. For non-imaging clinical research data, PIDs provided by study principal investigators (PIs) are pseudonymized using a nightly automated process. The pseudonymized PIDs (P-PIDs), along with other protected health information, are further de-identified using POSDA. RESULTS: A sample of 250 PIDs pseudonymized by O-CAPP was selected and successfully validated. Of those, 125 PIDs that were pseudonymized by the nightly automated process were validated by multiple clinical trial investigators (CTIs). For the other 125, CTIs validated radiologic image pseudonymization by API request based on the provided PID and P-PID mappings. CONCLUSIONS: We developed a novel approach to on-demand pseudonymization that will aid researchers in obtaining a comprehensive and holistic view of study participant data without compromising patient privacy.
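The abstract describes two routes to a P-PID: an on-demand API call for PIDs found in DICOM headers, and a nightly batch for PIDs supplied by study PIs. The record does not say how O-CAPP generates or stores pseudonyms, so the following is only a minimal sketch of the general idea under stated assumptions: a shared lookup table hands out random identifiers and reuses an existing mapping, so that the same PID receives the same P-PID on either route. The class name, method names, study label, and P-PID format are all hypothetical and are not taken from O-CAPP or POSDA.

```python
import secrets


class PseudonymRegistry:
    """Hypothetical in-memory PID -> P-PID table (not the O-CAPP implementation)."""

    def __init__(self, prefix="P"):
        self._prefix = prefix
        self._forward = {}    # (study, pid) -> P-PID already assigned
        self._issued = set()  # every P-PID handed out so far

    def pseudonymize(self, study, pid):
        """On-demand route: return the existing P-PID or assign a new one."""
        key = (study, pid)
        if key in self._forward:
            return self._forward[key]
        # Draw random identifiers until one is unused; the pseudonym carries
        # no information derived from the original PID.
        while True:
            candidate = f"{self._prefix}-{secrets.token_hex(4).upper()}"
            if candidate not in self._issued:
                break
        self._issued.add(candidate)
        self._forward[key] = candidate
        return candidate

    def pseudonymize_batch(self, study, pids):
        """Batch route: map a list of PIDs in one pass (e.g. a nightly job)."""
        return {pid: self.pseudonymize(study, pid) for pid in pids}


if __name__ == "__main__":
    registry = PseudonymRegistry()
    # A PID read from an image header and the same PID in a PI-supplied list
    # resolve to one P-PID, which keeps the two data sources linkable.
    from_image = registry.pseudonymize("STUDY-A", "MRN-0001")
    from_batch = registry.pseudonymize_batch("STUDY-A", ["MRN-0001", "MRN-0002"])
    print(from_image == from_batch["MRN-0001"])
```

A random identifier with a stored mapping is used here instead of a hash of the PID because a plain hash of a short identifier can be reversed by enumeration. A production service would also need persistent storage and access control on the mapping table, which this sketch omits.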
format Online
Article
Text
id pubmed-7921568
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Korean Society of Medical Informatics
record_format MEDLINE/PubMed
spelling pubmed-7921568 2021-03-04 API Driven On-Demand Participant ID Pseudonymization in Heterogeneous Multi-Study Research Healthc Inform Res Original Article Korean Society of Medical Informatics 2021-01 2021-01-31 /pmc/articles/PMC7921568/ /pubmed/33611875 http://dx.doi.org/10.4258/hir.2021.27.1.39 Text en © 2021 The Korean Society of Medical Informatics This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title API Driven On-Demand Participant ID Pseudonymization in Heterogeneous Multi-Study Research
topic Original Article