Contextual property detection in Dutch diagnosis descriptions for uncertainty, laterality and temporality
BACKGROUND: Accurate, coded problem lists are valuable for data reuse, including clinical decision support and research. However, healthcare providers frequently modify coded diagnoses by including or removing common contextual properties in free-text diagnosis descriptions: uncertainty (suspected g...
Main Authors: | Klappe, Eva S.; van Putten, Florentien J. P.; de Keizer, Nicolette F.; Cornet, Ronald |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8028823/ https://www.ncbi.nlm.nih.gov/pubmed/33827555 http://dx.doi.org/10.1186/s12911-021-01477-y |
_version_ | 1783676014197473280 |
---|---|
author | Klappe, Eva S.; van Putten, Florentien J. P.; de Keizer, Nicolette F.; Cornet, Ronald |
author_facet | Klappe, Eva S.; van Putten, Florentien J. P.; de Keizer, Nicolette F.; Cornet, Ronald |
author_sort | Klappe, Eva S. |
collection | PubMed |
description | BACKGROUND: Accurate, coded problem lists are valuable for data reuse, including clinical decision support and research. However, healthcare providers frequently modify coded diagnoses by including or removing common contextual properties in free-text diagnosis descriptions: uncertainty (suspected glaucoma), laterality (left glaucoma) and temporality (glaucoma 2002). These contextual properties could cause a difference in meaning between underlying diagnosis codes and modified descriptions, inhibiting data reuse. We therefore aimed to develop and evaluate an algorithm to identify these contextual properties. METHODS: A rule-based algorithm called UnLaTem (Uncertainty, Laterality, Temporality) was developed using a single-center dataset of 288,935 diagnosis descriptions, of which 73,280 (25.4%) were modified by healthcare providers. Internal validation of the algorithm was conducted with an independent sample of 980 unique records. A second validation was conducted with 996 records from a Dutch multicenter dataset of 175,210 modified descriptions from five hospitals. Two researchers independently annotated the two validation samples. Performance of the algorithm was determined using means of the recall and precision of the validation samples. The algorithm was applied to the multicenter dataset to determine the actual prevalence of the contextual properties within the modified descriptions per specialty. RESULTS: For the single-center dataset, recall (precision) for removal of uncertainty, uncertainty, laterality and temporality was 100 (60.0), 99.1 (89.9), 100 (97.3) and 97.6 (97.6), respectively. For the multicenter dataset, the corresponding values were 57.1 (88.9), 86.3 (88.9), 99.7 (93.5) and 96.8 (90.1). Within the modified descriptions of the multicenter dataset, 1.3% contained removal of uncertainty, 9.9% uncertainty, 31.4% laterality and 9.8% temporality. CONCLUSIONS: We successfully developed a rule-based algorithm named UnLaTem to identify contextual properties in Dutch modified diagnosis descriptions. UnLaTem could be extended with more trigger terms, new rules and the recognition of term order to increase the performance even further. The algorithm’s rules are available as additional file 2. Implementing UnLaTem in Dutch hospital systems can improve precision of information retrieval and extraction from diagnosis descriptions, which can be used for data reuse purposes such as decision support and research. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-021-01477-y. |
format | Online Article Text |
id | pubmed-8028823 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8028823 2021-04-09 Contextual property detection in Dutch diagnosis descriptions for uncertainty, laterality and temporality Klappe, Eva S.; van Putten, Florentien J. P.; de Keizer, Nicolette F.; Cornet, Ronald BMC Med Inform Decis Mak Research Article BioMed Central 2021-04-07 /pmc/articles/PMC8028823/ /pubmed/33827555 http://dx.doi.org/10.1186/s12911-021-01477-y Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Klappe, Eva S. van Putten, Florentien J. P. de Keizer, Nicolette F. Cornet, Ronald Contextual property detection in Dutch diagnosis descriptions for uncertainty, laterality and temporality |
title | Contextual property detection in Dutch diagnosis descriptions for uncertainty, laterality and temporality |
title_sort | contextual property detection in dutch diagnosis descriptions for uncertainty, laterality and temporality |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8028823/ https://www.ncbi.nlm.nih.gov/pubmed/33827555 http://dx.doi.org/10.1186/s12911-021-01477-y |
work_keys_str_mv | AT klappeevas contextualpropertydetectionindutchdiagnosisdescriptionsforuncertaintylateralityandtemporality AT vanputtenflorentienjp contextualpropertydetectionindutchdiagnosisdescriptionsforuncertaintylateralityandtemporality AT dekeizernicolettef contextualpropertydetectionindutchdiagnosisdescriptionsforuncertaintylateralityandtemporality AT cornetronald contextualpropertydetectionindutchdiagnosisdescriptionsforuncertaintylateralityandtemporality |
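
The abstract describes a rule-based approach built on trigger terms and evaluated with recall and precision against manual annotations. The following is a minimal sketch of that general idea in Python, not the UnLaTem implementation: the trigger terms in `TRIGGERS`, the function names `detect` and `precision_recall`, and the example descriptions are all hypothetical and chosen only for illustration. The actual rules are those published as additional file 2 of the article. Detecting "removal of uncertainty" would additionally require comparing a modified description with its underlying coded description, which this sketch does not attempt.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical Dutch trigger patterns, for illustration only.
# These are NOT the UnLaTem rules (see additional file 2 of the article).
TRIGGERS: Dict[str, List[str]] = {
    "uncertainty": [r"\bverdenking\b", r"\bmogelijk\b", r"\?"],
    "laterality": [r"\blinks\b", r"\brechts\b", r"\bbeiderzijds\b"],
    "temporality": [r"\b(19|20)\d{2}\b", r"\bsinds\b"],
}


def detect(description: str) -> Set[str]:
    """Return the contextual properties whose trigger patterns match."""
    text = description.lower()
    return {
        prop
        for prop, patterns in TRIGGERS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    }


def precision_recall(
    predicted: List[Set[str]], gold: List[Set[str]], prop: str
) -> Tuple[float, float]:
    """Compute precision and recall for one property over paired records."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if prop in p and prop in g)
    fp = sum(1 for p, g in pairs if prop in p and prop not in g)
    fn = sum(1 for p, g in pairs if prop not in p and prop in g)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Invented example descriptions with invented manual annotations.
    descriptions = ["Verdenking glaucoom", "Glaucoom links", "Glaucoom 2002"]
    gold = [{"uncertainty"}, {"laterality"}, {"temporality"}]
    predicted = [detect(d) for d in descriptions]
    for prop in TRIGGERS:
        print(prop, precision_recall(predicted, gold, prop))
```

Keeping the patterns in a plain dictionary reflects the abstract's remark that such an algorithm could be extended with more trigger terms and new rules. Recognition of term order, which the abstract names as a further extension, is not modeled here.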