A Semantic Transformation Methodology for the Secondary Use of Observational Healthcare Data in Postmarketing Safety Studies

Background: Utilization of the available observational healthcare datasets is key to complement and strengthen the postmarketing safety studies. Use of common data models (CDM) is the predominant approach in order to enable large scale systematic analyses on disparate data models and vocabularies. C...

Bibliographic Details
Main authors: Pacaci, Anil, Gonul, Suat, Sinaci, A. Anil, Yuksel, Mustafa, Laleci Erturkmen, Gokce B.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5937227/
https://www.ncbi.nlm.nih.gov/pubmed/29760661
http://dx.doi.org/10.3389/fphar.2018.00435
author Pacaci, Anil
Gonul, Suat
Sinaci, A. Anil
Yuksel, Mustafa
Laleci Erturkmen, Gokce B.
collection PubMed
description Background: Utilization of available observational healthcare datasets is key to complementing and strengthening postmarketing safety studies. Use of common data models (CDMs) is the predominant approach to enabling large-scale systematic analyses across disparate data models and vocabularies. Current CDM transformation practices depend on proprietarily developed Extract-Transform-Load (ETL) procedures, which require knowledge of both the semantics and the technical characteristics of the source datasets and the target CDM. Purpose: In this study, we aim to develop a modular but coordinated transformation approach that separates the semantic and technical steps of the transformation process, which are not strictly separated in traditional ETL approaches. Such an approach would separate three kinds of operation: extracting data from source electronic health record systems, aligning the source and target models at the semantic level, and populating target common data repositories. Approach: In order to separate the activities that are required to transform heterogeneous data sources to a target CDM, we introduce a semantic transformation approach composed of three steps: (1) transformation of source datasets to Resource Description Framework (RDF) format, (2) application of semantic conversion rules to obtain the data as instances of the ontological model of the target CDM, and (3) population of repositories, which comply with the specifications of the CDM, by processing the RDF instances from step 2. The proposed approach has been implemented in real healthcare settings, with the Observational Medical Outcomes Partnership (OMOP) CDM chosen as the common data model, and a comprehensive comparative analysis between the native and transformed data has been conducted. Results: Health records of ~1 million patients have been successfully transformed from the source database to an OMOP CDM-based database. Descriptive statistics obtained from the source and target databases present analogous and consistent results. Discussion and Conclusion: Our method goes beyond traditional ETL approaches by being more declarative and rigorous. It is declarative because the use of RDF-based mapping rules makes each mapping more transparent and understandable to humans while retaining logic-based computability. It is rigorous because the mappings would be based on computer-readable semantics, which are amenable to validation through logic-based inference methods.
format Online
Article
Text
id pubmed-5937227
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5937227 2018-05-14 Front Pharmacol Pharmacology Frontiers Media S.A. 2018-04-30 /pmc/articles/PMC5937227/ /pubmed/29760661 http://dx.doi.org/10.3389/fphar.2018.00435 Text en Copyright © 2018 Pacaci, Gonul, Sinaci, Yuksel and Laleci Erturkmen. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Semantic Transformation Methodology for the Secondary Use of Observational Healthcare Data in Postmarketing Safety Studies
topic Pharmacology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5937227/
https://www.ncbi.nlm.nih.gov/pubmed/29760661
http://dx.doi.org/10.3389/fphar.2018.00435
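
The three-step approach summarized in the description field can be illustrated with a small sketch. This is not the authors' implementation: plain Python tuples stand in for RDF triples, a dictionary stands in for the semantic conversion rules, and an in-memory SQLite table stands in for an OMOP CDM repository. The source field names (`patient_id`, `sex`, `birth_year`), the `src:` and `cdm:` prefixes, and all function names are hypothetical. The target column names and the gender concept IDs (8507, 8532) follow the OMOP PERSON table as commonly documented, but should be checked against the CDM version in use.

```python
import sqlite3

# Hypothetical source rows; the field names are illustrative only.
SOURCE_ROWS = [
    {"patient_id": 1, "sex": "M", "birth_year": 1950},
    {"patient_id": 2, "sex": "F", "birth_year": 1984},
]

# Assumed mapping of source sex codes to OMOP gender concept IDs.
GENDER_CONCEPTS = {"M": 8507, "F": 8532}

# Declarative rules: source predicate -> (target predicate, value converter).
RULES = {
    "src:sex": ("cdm:gender_concept_id", GENDER_CONCEPTS.get),
    "src:birth_year": ("cdm:year_of_birth", int),
}


def to_source_triples(rows):
    """Step 1: express each source row as (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        subject = f"src:patient/{row['patient_id']}"
        triples.append((subject, "src:sex", row["sex"]))
        triples.append((subject, "src:birth_year", row["birth_year"]))
    return triples


def apply_rules(triples, rules):
    """Step 2: rewrite source triples into the target (CDM) vocabulary."""
    converted = []
    for subject, predicate, value in triples:
        if predicate in rules:
            target_predicate, convert = rules[predicate]
            target_subject = subject.replace("src:patient/", "cdm:person/")
            converted.append((target_subject, target_predicate, convert(value)))
    return converted


def populate(triples, connection):
    """Step 3: group target triples by subject and insert rows into a person table."""
    connection.execute(
        "CREATE TABLE person (person_id INTEGER PRIMARY KEY, "
        "gender_concept_id INTEGER, year_of_birth INTEGER)"
    )
    people = {}
    for subject, predicate, value in triples:
        person_id = int(subject.rsplit("/", 1)[1])
        column = predicate.split(":", 1)[1]
        people.setdefault(person_id, {})[column] = value
    for person_id, fields in sorted(people.items()):
        connection.execute(
            "INSERT INTO person VALUES (?, ?, ?)",
            (person_id, fields.get("gender_concept_id"), fields.get("year_of_birth")),
        )


def run_pipeline(rows):
    """Run the three steps end to end and return the populated person rows."""
    connection = sqlite3.connect(":memory:")
    populate(apply_rules(to_source_triples(rows), RULES), connection)
    return connection.execute(
        "SELECT person_id, gender_concept_id, year_of_birth "
        "FROM person ORDER BY person_id"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline(SOURCE_ROWS))
```

Keeping the rules in a data structure separate from the extraction and loading functions mirrors the separation of semantic and technical steps that the abstract argues for: changing a mapping means editing `RULES`, not the code that reads the source or writes the target.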