Development and validation of the SickKids Enterprise-wide Data in Azure Repository (SEDAR)

OBJECTIVES: To describe the processes developed by The Hospital for Sick Children (SickKids) to enable utilization of electronic health record (EHR) data by creating sequentially transformed schemas for use across multiple user types. METHODS: We used Microsoft Azure as the cloud service provider an...

Bibliographic Details
Main authors: Guo, Lin Lawrence, Calligan, Maryann, Vettese, Emily, Cook, Sadie, Gagnidze, George, Han, Oscar, Inoue, Jiro, Lemmon, Joshua, Li, Johnson, Roshdi, Medhat, Sadovy, Bohdan, Wallace, Steven, Sung, Lillian
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10661187/
https://www.ncbi.nlm.nih.gov/pubmed/38027579
http://dx.doi.org/10.1016/j.heliyon.2023.e21586
collection PubMed
description OBJECTIVES: To describe the processes developed by The Hospital for Sick Children (SickKids) to enable utilization of electronic health record (EHR) data by creating sequentially transformed schemas for use across multiple user types. METHODS: We used Microsoft Azure as the cloud service provider and named this effort the SickKids Enterprise-wide Data in Azure Repository (SEDAR). On-premises Epic Clarity data was copied to a virtual network in Microsoft Azure. Three sequential schemas were developed. The Filtered Schema added a filter to retain only SickKids and valid patients. The Curated Schema created a data structure that was easier to navigate and query. Each table contained a logical unit such as patients, hospital encounters or laboratory tests. Data validation of randomly sampled observations in the Curated Schema was performed. The SK-OMOP Schema was designed to facilitate research and machine learning. Two individuals mapped medical elements to standard Observational Medical Outcomes Partnership (OMOP) concepts. RESULTS: A copy of Clarity data was transferred to Microsoft Azure and updated each night using log shipping. The Filtered Schema and Curated Schema were implemented as stored procedures and executed each night with incremental updates or full loads. Data validation required up to 16 iterations for each Curated Schema table. OMOP concept mapping achieved at least 80% coverage for each SK-OMOP table. CONCLUSIONS: We described our experience in creating three sequential schemas to address different EHR data access requirements. Future work should consider replicating this approach at other institutions to determine whether approaches are generalizable.
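The coverage figure reported for the concept mapping can be illustrated with a small calculation. The following is only a sketch under stated assumptions, not the authors' method: the record does not say whether coverage was measured per record or per distinct element, so this example counts records. The function name `mapping_coverage`, the local codes, and the concept IDs are all hypothetical placeholders.

```python
def mapping_coverage(source_codes, concept_map):
    """Return the fraction of records whose source code has a mapped concept.

    Assumption: coverage is counted per record, not per distinct code.
    """
    if not source_codes:
        return 0.0
    mapped = sum(1 for code in source_codes if concept_map.get(code) is not None)
    return mapped / len(source_codes)


# Hypothetical local codes for five records in one table.
records = ["LAB_HGB", "LAB_HGB", "LAB_WBC", "LAB_PLT", "LAB_UNKNOWN"]

# Hypothetical mapping from local codes to placeholder concept IDs.
concept_map = {"LAB_HGB": 1001, "LAB_WBC": 1002, "LAB_PLT": 1003}

coverage = mapping_coverage(records, concept_map)
print(f"coverage = {coverage:.0%}")
```

Here four of the five records carry a mapped code, so this toy table would sit exactly at the 80% level mentioned in the abstract.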
format Online Article Text
id pubmed-10661187
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10661187. 2023-11-02. Heliyon. Research Article. Elsevier, 2023-11-02. /pmc/articles/PMC10661187/ /pubmed/38027579 http://dx.doi.org/10.1016/j.heliyon.2023.e21586 Text, en. © 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Development and validation of the SickKids Enterprise-wide Data in Azure Repository (SEDAR)
topic Research Article