
Can Unified Medical Language System–based semantic representation improve automated identification of patient safety incident reports by type and severity?

OBJECTIVE: The study sought to evaluate the feasibility of using Unified Medical Language System (UMLS) semantic features for automated identification of reports about patient safety incidents by type and severity. MATERIALS AND METHODS: Binary support vector machine (SVM) classifier ensembles were...


Bibliographic Details
Main Authors:	Wang, Ying; Coiera, Enrico; Magrabi, Farah
Format:	Online Article Text
Language:	English
Published:	Oxford University Press, 2020
Subjects:	Research and Applications
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7566533/ https://www.ncbi.nlm.nih.gov/pubmed/32574362 http://dx.doi.org/10.1093/jamia/ocaa082

_version_	1783596149738831872
author	Wang, Ying Coiera, Enrico Magrabi, Farah
collection	PubMed
description	OBJECTIVE: The study sought to evaluate the feasibility of using Unified Medical Language System (UMLS) semantic features for automated identification of reports about patient safety incidents by type and severity. MATERIALS AND METHODS: Binary support vector machine (SVM) classifier ensembles were trained and validated using balanced datasets of critical incident report texts (n_type = 2860, n_severity = 1160) collected from a state-wide reporting system. Generalizability was evaluated on a different, independent hospital-level reporting system. Concepts were extracted from report narratives using the UMLS Metathesaurus, and their relevance and frequency were used as semantic features. Performance was evaluated by F-score, Hamming loss, and exact match score and was compared with SVM ensembles using bag-of-words (BOW) features on 3 testing datasets (type/severity: n_benchmark = 286/116, n_original = 444/4837, n_independent = 6000/5950). RESULTS: SVMs using semantic features met or outperformed those based on BOW features to identify 10 different incident types (F-score [semantics/BOW]: benchmark = 82.6%/69.4%; original = 77.9%/68.8%; independent = 78.0%/67.4%) and extreme-risk events (F-score [semantics/BOW]: benchmark = 87.3%/87.3%; original = 25.5%/19.8%; independent = 49.6%/52.7%). For incident type, the exact match score for semantic classifiers was consistently higher than BOW across all test datasets (exact match [semantics/BOW]: benchmark = 48.9%/39.9%; original = 57.9%/44.4%; independent = 59.5%/34.9%). DISCUSSION: BOW representations are not ideal for the automated identification of incident reports because they do not account for text semantics. UMLS semantic representations are likely to better capture information in report narratives, which may explain their superior performance. CONCLUSIONS: UMLS-based semantic classifiers were effective in identifying incidents by type and extreme-risk events, providing better generalizability than classifiers using BOW.
format	Online Article Text
id	pubmed-7566533
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-7566533 2020-10-20 Can Unified Medical Language System–based semantic representation improve automated identification of patient safety incident reports by type and severity? Wang, Ying Coiera, Enrico Magrabi, Farah J Am Med Inform Assoc Research and Applications Oxford University Press 2020-06-18 /pmc/articles/PMC7566533/ /pubmed/32574362 http://dx.doi.org/10.1093/jamia/ocaa082 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Can Unified Medical Language System–based semantic representation improve automated identification of patient safety incident reports by type and severity?
topic	Research and Applications
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7566533/ https://www.ncbi.nlm.nih.gov/pubmed/32574362 http://dx.doi.org/10.1093/jamia/ocaa082
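
The abstract reports three evaluation measures for the multi-label incident-type task: F-score, Hamming loss, and exact match score. The sketch below shows how such measures are commonly computed when each report carries a set of labels. It is an illustration only, not the authors' code: the label names and the four toy reports are hypothetical, and micro-averaging for the F-score is an assumption, since the record does not say which averaging the study used.

```python
from typing import List, Set

# Hypothetical label vocabulary (the study used 10 incident types; 4 are invented here).
LABELS = ["falls", "medication", "documentation", "infection"]


def hamming_loss(y_true: List[Set[str]], y_pred: List[Set[str]], n_labels: int) -> float:
    """Fraction of (report, label) decisions that are wrong."""
    # The symmetric difference counts labels that are missed or wrongly added.
    wrong = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * n_labels)


def exact_match(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Fraction of reports whose predicted label set equals the true set."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f_score(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives over all reports."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: four reports with invented true and predicted label sets.
y_true = [{"falls"}, {"medication", "documentation"}, {"infection"}, {"falls", "medication"}]
y_pred = [{"falls"}, {"medication"}, {"infection", "falls"}, {"falls", "medication"}]

print(hamming_loss(y_true, y_pred, len(LABELS)))  # 0.125
print(exact_match(y_true, y_pred))                # 0.5
print(f"{micro_f_score(y_true, y_pred):.3f}")     # 0.833
```

Exact match is the strictest of the three, because one missing or extra label makes the whole report count as wrong. That is consistent with the exact match scores in the abstract being lower than the corresponding F-scores.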


Similar Items