Towards quality improvement of vaccine concept mappings in the OMOP vocabulary with a semi-automated method

The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) provides a unified model to integrate disparate real-world data (RWD) sources. An integral part of the OMOP CDM is the Standardized Vocabularies (henceforth referred to as the OMOP vocabulary), which enables organization a...

Bibliographic Details
Main Authors: Abeysinghe, Rashmie, Black, Adam, Kaduk, Denys, Li, Yupeng, Reich, Christian, Davydov, Alexander, Yao, Lixia, Cui, Licong
Format: Online Article Text
Language: English
Published: 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940475/
https://www.ncbi.nlm.nih.gov/pubmed/36029954
http://dx.doi.org/10.1016/j.jbi.2022.104162
author Abeysinghe, Rashmie
Black, Adam
Kaduk, Denys
Li, Yupeng
Reich, Christian
Davydov, Alexander
Yao, Lixia
Cui, Licong
collection PubMed
description The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) provides a unified model to integrate disparate real-world data (RWD) sources. An integral part of the OMOP CDM is the Standardized Vocabularies (henceforth referred to as the OMOP vocabulary), which enables organization and standardization of medical concepts across various clinical domains of the OMOP CDM. For concepts with the same meaning from different source vocabularies, one is designated as the standard concept, while the others are specified as non-standard or source concepts and mapped to the standard one. However, due to the heterogeneity of source vocabularies, there may exist mapping issues such as erroneous mappings and missing mappings in the OMOP vocabulary, which could affect the results of downstream analyses with RWD. In this paper, we focus on quality assurance of vaccine concept mappings in the OMOP vocabulary, which is necessary to accurately harness the power of RWD on vaccines. We introduce a semi-automated lexical approach to audit vaccine mappings in the OMOP vocabulary. We generated two types of vaccine-pairs: mapped and unmapped, where mapped vaccine-pairs are pairs of vaccine concepts with a “Maps to” relationship, while unmapped vaccine-pairs are those without a “Maps to” relationship. We represented each vaccine concept name as a set of words, and derived term-difference pairs (i.e., name differences) for mapped and unmapped vaccine-pairs. If the same term-difference pair can be obtained from both mapped and unmapped vaccine-pairs, then this is considered a potential mapping inconsistency. Applying this approach to the vaccine mappings in OMOP, a total of 2087 potential mapping inconsistencies were obtained. A random sample of 200 was evaluated by domain experts to identify, validate, and categorize the inconsistencies. Experts identified 95 cases revealing valid mapping issues. The remaining 105 cases were found to be invalid due to external and/or contextual information used in the mappings that was not reflected in the concept names of vaccines. This indicates that our semi-automated approach shows promise in identifying mapping inconsistencies among vaccine concepts in the OMOP vocabulary.
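The lexical audit described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the whitespace tokenization, the choice to compare every source concept against every standard concept, and the toy concept names and IDs are all assumptions made for illustration. Each name becomes a set of words, each pair yields a term-difference pair (words only in the source name, words only in the target name), and a term-difference pair seen among both mapped and unmapped pairs is flagged.

```python
from collections import defaultdict


def word_set(name):
    """Represent a concept name as a set of lowercase words (assumed tokenization)."""
    return frozenset(name.lower().split())


def term_difference(source_name, target_name):
    """Return (words only in source, words only in target) as a pair of sets."""
    s, t = word_set(source_name), word_set(target_name)
    return (s - t, t - s)


def potential_inconsistencies(sources, standards, maps_to):
    """Flag term-difference pairs produced by both mapped and unmapped pairs.

    sources, standards: dicts of hypothetical concept ID -> concept name.
    maps_to: set of (source_id, standard_id) tuples with a "Maps to" relationship.
    """
    mapped = defaultdict(list)
    unmapped = defaultdict(list)
    for sid, sname in sources.items():
        for tid, tname in standards.items():
            diff = term_difference(sname, tname)
            bucket = mapped if (sid, tid) in maps_to else unmapped
            bucket[diff].append((sid, tid))
    # Keep only differences that occur in both groups.
    return {d: (mapped[d], unmapped[d]) for d in mapped if d in unmapped}


# Toy data, invented for illustration only (not real OMOP concepts).
sources = {1: "hepatitis B vaccine adult", 2: "hepatitis A vaccine adult"}
standards = {10: "hepatitis B vaccine", 11: "hepatitis A vaccine"}
maps_to = {(1, 10)}

flagged = potential_inconsistencies(sources, standards, maps_to)
for (only_source, only_target), (m, u) in flagged.items():
    print(sorted(only_source), sorted(only_target), m, u)
```

In this toy case the difference ({"adult"}, {}) arises from the mapped pair (1, 10) and from the unmapped pair (2, 11), so the unmapped pair is surfaced as a candidate missing mapping for expert review. The all-pairs comparison is quadratic in the number of concepts and is used here only to keep the sketch short.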
format Online
Article
Text
id pubmed-9940475
institution National Center for Biotechnology Information
language English
publishDate 2022
record_format MEDLINE/PubMed
spelling pubmed-9940475 2023-02-20 J Biomed Inform Article 2022-10 2022-08-25 /pmc/articles/PMC9940475/ /pubmed/36029954 http://dx.doi.org/10.1016/j.jbi.2022.104162 Text en This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Towards quality improvement of vaccine concept mappings in the OMOP vocabulary with a semi-automated method
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940475/
https://www.ncbi.nlm.nih.gov/pubmed/36029954
http://dx.doi.org/10.1016/j.jbi.2022.104162