
Clinical Term Normalization Using Learned Edit Patterns and Subconcept Matching: System Development and Evaluation

BACKGROUND: Clinical terms mentioned in clinical text are often not in their standardized forms as listed in clinical terminologies because of linguistic and stylistic variations. However, many automated downstream applications require clinical terms mapped to their corresponding concepts in clinica...

Bibliographic Details
Main author: Kate, Rohit J
Format: Online Article Text
Language: English
Published: JMIR Publications 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7843202/
https://www.ncbi.nlm.nih.gov/pubmed/33443483
http://dx.doi.org/10.2196/23104
author Kate, Rohit J
collection PubMed
description BACKGROUND: Clinical terms mentioned in clinical text are often not in their standardized forms as listed in clinical terminologies because of linguistic and stylistic variations. However, many automated downstream applications require clinical terms mapped to their corresponding concepts in clinical terminologies, thus necessitating the task of clinical term normalization.
OBJECTIVE: In this paper, a system for clinical term normalization is presented that utilizes edit patterns to convert clinical terms into their normalized forms.
METHODS: The edit patterns are automatically learned from the Unified Medical Language System (UMLS) Metathesaurus as well as from the given training data. The edit patterns are generalized sequences of edits that are derived from edit distance computations. The edit patterns are both character based as well as word based and are learned separately for different semantic types. In addition to these edit patterns, the system also normalizes clinical terms through the subconcepts mentioned within them.
RESULTS: The system was evaluated as part of the 2019 n2c2 Track 3 shared task of clinical term normalization. It obtained 80.79% accuracy on the standard test data. This paper includes ablation studies to evaluate the contributions of different components of the system. A challenging part of the task was disambiguation when a clinical term could be normalized to multiple concepts.
CONCLUSIONS: The learned edit patterns led the system to perform well on the normalization task. Given that the system is based on patterns, it is human interpretable and is also capable of giving insights about common variations of clinical terms mentioned in clinical text that are different from their standardized forms.
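The abstract describes edit patterns as generalized sequences of edits derived from edit distance computations, at both the character and the word level. The sketch below illustrates that general idea only: it uses a standard Levenshtein alignment and a deliberately simple generalization (runs of unchanged items collapse to a `*` wildcard). The function names `edit_ops` and `generalize`, the wildcard scheme, and the example terms are assumptions made for illustration. They are not taken from the paper, whose actual generalization procedure is not given in this record.

```python
def edit_ops(source, target):
    """Return one minimal sequence of edits turning source into target.

    Works on any sequences: strings give character-based edits,
    lists of words give word-based edits.
    """
    m, n = len(source), len(target)
    # dist[i][j] = edit distance between source[:i] and target[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete from source
                dist[i][j - 1] + 1,         # insert from target
                dist[i - 1][j - 1] + cost,  # keep or replace
            )

    # Walk back from the bottom-right corner to recover the edits.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            ops.append(("keep", source[i - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("replace", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", source[i - 1]))
            i -= 1
        else:
            ops.append(("insert", target[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def generalize(ops):
    """Collapse each run of unchanged items into a single '*' wildcard.

    This is an illustrative generalization, not the paper's scheme.
    """
    pattern = []
    for op in ops:
        if op[0] == "keep":
            if not pattern or pattern[-1] != "*":
                pattern.append("*")
        else:
            pattern.append(op)
    return pattern


# Word-based example (hypothetical term pair).
word_ops = edit_ops("fracture of the femur".split(), "fracture of femur".split())
word_pattern = generalize(word_ops)   # ['*', ('delete', 'the'), '*']

# Character-based example (hypothetical spelling variant).
char_pattern = generalize(edit_ops("tumour", "tumor"))   # ['*', ('delete', 'u'), '*']
```

A pattern such as `['*', ('delete', 'the'), '*']` is readable on its own, which is consistent with the interpretability the abstract claims for a pattern-based system.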
format Online
Article
Text
id pubmed-7843202
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7843202 2021-02-01 Clinical Term Normalization Using Learned Edit Patterns and Subconcept Matching: System Development and Evaluation Kate, Rohit J JMIR Med Inform Original Paper
JMIR Publications 2021-01-14 /pmc/articles/PMC7843202/ /pubmed/33443483 http://dx.doi.org/10.2196/23104 Text en ©Rohit J Kate. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 14.01.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Clinical Term Normalization Using Learned Edit Patterns and Subconcept Matching: System Development and Evaluation
topic Original Paper