Quality evaluation of value sets from cancer study common data elements using the UMLS semantic groups

OBJECTIVE: The objective of this study is to develop an approach to evaluate the quality of terminological annotations on the value set (ie, enumerated value domain) components of the common data elements (CDEs) in the context of clinical research using both unified medical language system (UMLS) se...

Bibliographic Details
Main Authors: Jiang, Guoqian; Solbrig, Harold R; Chute, Christopher G
Format: Online Article Text
Language: English
Published: BMJ Group 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3392855/
https://www.ncbi.nlm.nih.gov/pubmed/22511016
http://dx.doi.org/10.1136/amiajnl-2011-000739
_version_ 1782237659001782272
author Jiang, Guoqian
Solbrig, Harold R
Chute, Christopher G
author_sort Jiang, Guoqian
collection PubMed
description OBJECTIVE: The objective of this study is to develop an approach to evaluate the quality of terminological annotations on the value set (i.e., enumerated value domain) components of the common data elements (CDEs) in the context of clinical research using both Unified Medical Language System (UMLS) semantic types and groups. MATERIALS AND METHODS: The CDEs of the National Cancer Institute (NCI) Cancer Data Standards Repository, the NCI Thesaurus (NCIt) concepts, and the UMLS semantic network were integrated using a semantic web-based framework for a SPARQL-enabled evaluation. First, the set of CDE-permissible values with corresponding meanings in external controlled terminologies was isolated. The corresponding value meanings were then evaluated against their NCI- or UMLS-generated semantic network mapping to determine whether all of the meanings fell within the same semantic group. RESULTS: Of the enumerated CDEs in the Cancer Data Standards Repository, 3093 (26.2%) had elements drawn from more than one UMLS semantic group. A random sample (n=100) of this set of elements indicated that 17% of them were likely to have been misclassified. DISCUSSION: The use of existing semantic web tools can support a high-throughput mechanism for evaluating the quality of large CDE collections. This study demonstrates that the involvement of multiple semantic groups in an enumerated value domain of a CDE is an effective anchor to trigger an auditing point for quality evaluation activities. CONCLUSION: This approach produces a useful quality assurance mechanism for a clinical study CDE repository.
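The check described in the abstract (flag a CDE when its value meanings fall into more than one UMLS semantic group) can be illustrated with a minimal Python sketch. This is not the authors' SPARQL implementation: the CDE identifiers, the sample value sets, and the function names are hypothetical, and the type-to-group table is only a small excerpt chosen for illustration.

```python
from collections import defaultdict

# Small illustrative excerpt of a semantic type -> semantic group mapping.
# A real evaluation would load the full UMLS semantic groups file instead.
SEMANTIC_GROUP_OF_TYPE = {
    "Neoplastic Process": "Disorders",
    "Disease or Syndrome": "Disorders",
    "Body Part, Organ, or Organ Component": "Anatomy",
    "Pharmacologic Substance": "Chemicals & Drugs",
}

# Hypothetical value sets: CDE id -> list of (value meaning, semantic type).
VALUE_SETS = {
    "CDE-A": [("Adenocarcinoma", "Neoplastic Process"),
              ("Lymphoma", "Neoplastic Process")],
    "CDE-B": [("Melanoma", "Neoplastic Process"),
              ("Liver", "Body Part, Organ, or Organ Component")],
    "CDE-C": [("Aspirin", "Pharmacologic Substance")],
}


def semantic_groups_by_cde(value_sets, group_of_type):
    """Collect the set of semantic groups covered by each CDE's value meanings."""
    groups = defaultdict(set)
    for cde_id, meanings in value_sets.items():
        for _meaning, semantic_type in meanings:
            groups[cde_id].add(group_of_type[semantic_type])
    return dict(groups)


def flag_for_audit(value_sets, group_of_type):
    """Return the sorted ids of CDEs whose value set spans more than one group."""
    groups = semantic_groups_by_cde(value_sets, group_of_type)
    return sorted(cde_id for cde_id, found in groups.items() if len(found) > 1)


def flagged_fraction(value_sets, group_of_type):
    """Fraction of CDEs flagged for audit (0.0 for an empty collection)."""
    if not value_sets:
        return 0.0
    return len(flag_for_audit(value_sets, group_of_type)) / len(value_sets)


if __name__ == "__main__":
    print(flag_for_audit(VALUE_SETS, SEMANTIC_GROUP_OF_TYPE))
    print(flagged_fraction(VALUE_SETS, SEMANTIC_GROUP_OF_TYPE))
```

A flagged CDE is only a candidate for manual review, in line with the abstract's finding that a minority of the sampled multi-group value sets were likely misclassified.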
format Online
Article
Text
id pubmed-3392855
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BMJ Group
record_format MEDLINE/PubMed
spelling pubmed-3392855 2012-07-10 J Am Med Inform Assoc Research and Applications
BMJ Group 2012-04-17 2012-06 /pmc/articles/PMC3392855/ /pubmed/22511016 http://dx.doi.org/10.1136/amiajnl-2011-000739 Text en © 2012, Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions. This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.
title Quality evaluation of value sets from cancer study common data elements using the UMLS semantic groups
topic Research and Applications