Biomedical Vocabulary Alignment at Scale in the UMLS Metathesaurus

With 214 source vocabularies, the construction and maintenance process of the UMLS (Unified Medical Language System) Metathesaurus terminology integration system is costly, time-consuming, and error-prone as it primarily relies on (1) lexical and semantic processing for suggesting groupings of synon...

Bibliographic Details
Main Authors: Nguyen, Vinh; Yip, Hong Yung; Bodenreider, Olivier
Format: Online Article Text
Language: English
Published: 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8434895/
https://www.ncbi.nlm.nih.gov/pubmed/34514472
http://dx.doi.org/10.1145/3442381.3450128
author Nguyen, Vinh
Yip, Hong Yung
Bodenreider, Olivier
collection PubMed
description With 214 source vocabularies, the construction and maintenance process of the UMLS (Unified Medical Language System) Metathesaurus terminology integration system is costly, time-consuming, and error-prone as it primarily relies on (1) lexical and semantic processing for suggesting groupings of synonymous terms, and (2) the expertise of UMLS editors for curating these synonymy predictions. This paper aims to improve the UMLS Metathesaurus construction process by developing a novel supervised learning approach for improving the task of suggesting synonymous pairs that can scale to the size and diversity of the UMLS source vocabularies. We evaluate this deep learning (DL) approach against a rule-based approach (RBA) that approximates the current UMLS Metathesaurus construction process. The key to the generalizability of our approach is the use of various degrees of lexical similarity in negative pairs during the training process. Our initial experiments demonstrate the strong performance across multiple datasets of our DL approach in terms of recall (91-92%), precision (88-99%), and F1 score (89-95%). Our DL approach largely outperforms the RBA method in recall (+23%), precision (+2.4%), and F1 score (+14.1%). This novel approach has great potential for improving the UMLS Metathesaurus construction process by providing better synonymy suggestions to the UMLS editors.
format Online
Article
Text
id pubmed-8434895
institution National Center for Biotechnology Information
language English
publishDate 2021
record_format MEDLINE/PubMed
spelling pubmed-8434895
2021-09-11
Biomedical Vocabulary Alignment at Scale in the UMLS Metathesaurus
Nguyen, Vinh; Yip, Hong Yung; Bodenreider, Olivier
Proc Int World Wide Web Conf
Article
2021-04-19
2021-04
/pmc/articles/PMC8434895/
/pubmed/34514472
http://dx.doi.org/10.1145/3442381.3450128
Text
en
https://creativecommons.org/licenses/by/4.0/
This paper is published under the Creative Commons Attribution 4.0 International (CC-BY 4.0) license. Authors reserve their rights to disseminate the work on their personal and corporate Web sites with the appropriate attribution.
title Biomedical Vocabulary Alignment at Scale in the UMLS Metathesaurus
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8434895/
https://www.ncbi.nlm.nih.gov/pubmed/34514472
http://dx.doi.org/10.1145/3442381.3450128