Sculpting the UMLS Refined Semantic Network
Main authors: | He, Zhe; Morrey, C. Paul; Perl, Yehoshua; Elhanan, Gai; Chen, Ling; Chen, Yan; Geller, James |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | University of Illinois at Chicago Library 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4235323/ https://www.ncbi.nlm.nih.gov/pubmed/25422719 http://dx.doi.org/10.5210/ojphi.v6i2.5412 |
_version_ | 1782345008336076800 |
---|---|
author | He, Zhe; Morrey, C. Paul; Perl, Yehoshua; Elhanan, Gai; Chen, Ling; Chen, Yan; Geller, James |
author_facet | He, Zhe; Morrey, C. Paul; Perl, Yehoshua; Elhanan, Gai; Chen, Ling; Chen, Yan; Geller, James |
author_sort | He, Zhe |
collection | PubMed |
description | BACKGROUND: The Refined Semantic Network (RSN) for the UMLS was previously introduced to complement the UMLS Semantic Network (SN). The RSN partitions the UMLS Metathesaurus (META) into disjoint groups of concepts, each of which is semantically uniform. However, the RSN was initially an order of magnitude larger than the SN, which is undesirable because a semantic network must be compact to be useful. Most semantic types in the RSN represent combinations of semantic types in the UMLS SN. Such a “combination semantic type” is called an Intersection Semantic Type (IST). Many ISTs are assigned to very few concepts. Moreover, a review of those concepts found many inconsistencies in semantic type assignment. After those inconsistencies were corrected, many ISTs, among them some that contradicted UMLS rules, disappeared, which made the RSN smaller. OBJECTIVE: The authors performed a longitudinal study with the goal of making the RSN compact. This goal was achieved by correcting inconsistencies and errors in the IST assignments in the UMLS, which additionally helped identify and correct ambiguities, inconsistencies, and errors in source terminologies widely used in public health. METHODS: In this paper, we discuss the process and steps employed in this longitudinal study and the intermediate results at different stages. The sculpting process includes removing redundant semantic type assignments, expanding semantic type assignments, and removing illegitimate ISTs by auditing ISTs of small extents. However, the emphasis of this paper is not on the auditing methodologies employed during the process, since they were introduced in earlier publications, but on the strategy of employing them to transform the RSN into a compact network. For this paper we also performed a comprehensive audit of 168 “small ISTs” in the 2013AA version of the UMLS to finalize the longitudinal study. RESULTS: Over the years it was found that the editors of the UMLS introduced some new inconsistencies that reintroduced unwarranted ISTs that had already been eliminated by previous corrections. This slowed the transformation of the RSN into a compact network covering all necessary categories for the UMLS. The corrections suggested by an audit of the 2013AA version of the UMLS achieve a compact RSN of the same order of magnitude as the UMLS SN. The number of ISTs has been reduced to 336. We also demonstrate how auditing the semantic type assignments of UMLS concepts can expose other modeling errors in the UMLS source terminologies that are important for health informatics, e.g., SNOMED CT, LOINC, and RxNORM. Such errors would otherwise stay hidden. CONCLUSIONS: It is hoped that the UMLS curators will implement all required corrections and use the RSN along with the SN when maintaining and extending the UMLS. When used correctly, the RSN will help prevent the accidental introduction of inconsistent semantic type assignments into the UMLS. Furthermore, used this way, the RSN will help expose other hidden errors and inconsistencies in the health informatics terminologies that are sources of the UMLS. Notably, the development of the RSN materializes the deeper, more refined Semantic Network for the UMLS that its designers originally envisioned but had not implemented. |
format | Online Article Text |
id | pubmed-4235323 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | University of Illinois at Chicago Library |
record_format | MEDLINE/PubMed |
spelling | pubmed-4235323 2014-11-24 Online J Public Health Inform Research Article University of Illinois at Chicago Library 2014-10-16 /pmc/articles/PMC4235323/ /pubmed/25422719 http://dx.doi.org/10.5210/ojphi.v6i2.5412 Text en This is an Open Access article. Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the copy and the copy is used for educational, not-for-profit purposes. |
title | Sculpting the UMLS Refined Semantic Network |
title_sort | sculpting the umls refined semantic network |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4235323/ https://www.ncbi.nlm.nih.gov/pubmed/25422719 http://dx.doi.org/10.5210/ojphi.v6i2.5412 |
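
The partitioning idea in the abstract above can be illustrated with a minimal Python sketch. It assumes a simple in-memory mapping from concept identifier to its set of semantic types. The concept identifiers, the function names, and the `max_extent` threshold are hypothetical and chosen only for illustration; the record does not state the authors' actual data format or the cutoff that defines a "small IST".

```python
from collections import defaultdict


def partition_by_semantic_types(assignments):
    """Group concepts by their exact set of semantic types.

    Each distinct set of types defines one group (its "extent"), so the
    groups are disjoint and every concept lands in exactly one of them.
    """
    extents = defaultdict(set)
    for concept, types in assignments.items():
        extents[frozenset(types)].add(concept)
    return dict(extents)


def small_ists(extents, max_extent):
    """Return combination types (two or more semantic types) whose extent
    holds at most max_extent concepts. The threshold is an assumption."""
    return {
        types: concepts
        for types, concepts in extents.items()
        if len(types) >= 2 and len(concepts) <= max_extent
    }


# Toy data: the concept identifiers are made up for this example.
assignments = {
    "C0000001": {"Pharmacologic Substance"},
    "C0000002": {"Pharmacologic Substance", "Organic Chemical"},
    "C0000003": {"Pharmacologic Substance", "Organic Chemical"},
    "C0000004": {"Disease or Syndrome", "Finding"},
}

extents = partition_by_semantic_types(assignments)
flagged = small_ists(extents, max_extent=1)

for types, concepts in flagged.items():
    print(sorted(types), sorted(concepts))
```

In this toy run the only combination flagged for review is the one assigned to a single concept. Correcting or confirming such assignments is the kind of step that would shrink the set of combination types, under the assumptions stated above.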