Transforming and evaluating the UK Biobank to the OMOP Common Data Model for COVID-19 research and beyond

OBJECTIVE: The coronavirus disease 2019 (COVID-19) pandemic has demonstrated the value of real-world data for public health research. International federated analyses are crucial for informing policy makers. Common data models (CDMs) are critical for enabling these studies to be performed efficientl...

Bibliographic Details
Main authors: Papez, Vaclav, Moinat, Maxim, Voss, Erica A, Bazakou, Sofia, Van Winzum, Anne, Peviani, Alessia, Payralbe, Stefan, Lara, Elena Garcia, Kallfelz, Michael, Asselbergs, Folkert W, Prieto-Alhambra, Daniel, Dobson, Richard J B, Denaxas, Spiros
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9619789/
https://www.ncbi.nlm.nih.gov/pubmed/36227072
http://dx.doi.org/10.1093/jamia/ocac203
author Papez, Vaclav
Moinat, Maxim
Voss, Erica A
Bazakou, Sofia
Van Winzum, Anne
Peviani, Alessia
Payralbe, Stefan
Lara, Elena Garcia
Kallfelz, Michael
Asselbergs, Folkert W
Prieto-Alhambra, Daniel
Dobson, Richard J B
Denaxas, Spiros
author_sort Papez, Vaclav
collection PubMed
description OBJECTIVE: The coronavirus disease 2019 (COVID-19) pandemic has demonstrated the value of real-world data for public health research. International federated analyses are crucial for informing policy makers. Common data models (CDMs) are critical for enabling these studies to be performed efficiently. Our objective was to convert the UK Biobank, a study of 500 000 participants with rich genetic and phenotypic data, to the Observational Medical Outcomes Partnership (OMOP) CDM. MATERIALS AND METHODS: We converted UK Biobank data to OMOP CDM v. 5.3. We transformed participant research data on diseases collected at recruitment and electronic health records (EHRs) from primary care, hospitalizations, cancer registrations, and mortality from providers in England, Scotland, and Wales. We performed syntactic and semantic validations and compared comorbidities and risk factors between source and transformed data. RESULTS: We identified 502 505 participants (3086 with COVID-19) and transformed 690 fields (1 373 239 555 rows) to the OMOP CDM using 8 different controlled clinical terminologies and bespoke mappings. Specifically, we transformed self-reported noncancer illnesses 946 053 (83.91% of all source entries), cancers 37 802 (70.81%), medications 1 218 935 (88.25%), and prescriptions 864 788 (86.96%). In EHR, we transformed 13 028 182 (99.95%) hospital diagnoses, 6 465 399 (89.2%) procedures, 337 896 333 primary care diagnoses (CTV3, SNOMED-CT), 139 966 587 (98.74%) prescriptions (dm+d) and 77 127 (99.95%) deaths (ICD-10). We observed good concordance across demographic, risk factor, and comorbidity factors between source and transformed data. DISCUSSION AND CONCLUSION: Our study demonstrated that the OMOP CDM can be successfully leveraged to harmonize complex large-scale biobanked studies combining rich multimodal phenotypic data. Our study uncovered several challenges when transforming data from questionnaires to the OMOP CDM which require further research.
The transformed UK Biobank resource is a valuable tool that can enable federated research, like COVID-19 studies.
format Online
Article
Text
id pubmed-9619789
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-96197892022-11-04 Transforming and evaluating the UK Biobank to the OMOP Common Data Model for COVID-19 research and beyond Papez, Vaclav Moinat, Maxim Voss, Erica A Bazakou, Sofia Van Winzum, Anne Peviani, Alessia Payralbe, Stefan Lara, Elena Garcia Kallfelz, Michael Asselbergs, Folkert W Prieto-Alhambra, Daniel Dobson, Richard J B Denaxas, Spiros J Am Med Inform Assoc Research and Applications OBJECTIVE: The coronavirus disease 2019 (COVID-19) pandemic has demonstrated the value of real-world data for public health research. International federated analyses are crucial for informing policy makers. Common data models (CDMs) are critical for enabling these studies to be performed efficiently. Our objective was to convert the UK Biobank, a study of 500 000 participants with rich genetic and phenotypic data, to the Observational Medical Outcomes Partnership (OMOP) CDM. MATERIALS AND METHODS: We converted UK Biobank data to OMOP CDM v. 5.3. We transformed participant research data on diseases collected at recruitment and electronic health records (EHRs) from primary care, hospitalizations, cancer registrations, and mortality from providers in England, Scotland, and Wales. We performed syntactic and semantic validations and compared comorbidities and risk factors between source and transformed data. RESULTS: We identified 502 505 participants (3086 with COVID-19) and transformed 690 fields (1 373 239 555 rows) to the OMOP CDM using 8 different controlled clinical terminologies and bespoke mappings. Specifically, we transformed self-reported noncancer illnesses 946 053 (83.91% of all source entries), cancers 37 802 (70.81%), medications 1 218 935 (88.25%), and prescriptions 864 788 (86.96%). In EHR, we transformed 13 028 182 (99.95%) hospital diagnoses, 6 465 399 (89.2%) procedures, 337 896 333 primary care diagnoses (CTV3, SNOMED-CT), 139 966 587 (98.74%) prescriptions (dm+d) and 77 127 (99.95%) deaths (ICD-10).
We observed good concordance across demographic, risk factor, and comorbidity factors between source and transformed data. DISCUSSION AND CONCLUSION: Our study demonstrated that the OMOP CDM can be successfully leveraged to harmonize complex large-scale biobanked studies combining rich multimodal phenotypic data. Our study uncovered several challenges when transforming data from questionnaires to the OMOP CDM which require further research. The transformed UK Biobank resource is a valuable tool that can enable federated research, like COVID-19 studies. Oxford University Press 2022-10-13 /pmc/articles/PMC9619789/ /pubmed/36227072 http://dx.doi.org/10.1093/jamia/ocac203 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Transforming and evaluating the UK Biobank to the OMOP Common Data Model for COVID-19 research and beyond
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9619789/
https://www.ncbi.nlm.nih.gov/pubmed/36227072
http://dx.doi.org/10.1093/jamia/ocac203