Automatic Annotation of Narrative Radiology Reports

Narrative texts in electronic health records can be efficiently utilized for building decision support systems in the clinic, only if they are correctly interpreted automatically in accordance with a specified standard. This paper tackles the problem of developing an automated method of labeling fre...

Bibliographic Details
Main Authors:	Krsnik, Ivan, Glavaš, Goran, Krsnik, Marina, Miletić, Damir, Štajduhar, Ivan
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7235892/ https://www.ncbi.nlm.nih.gov/pubmed/32244833 http://dx.doi.org/10.3390/diagnostics10040196

_version_	1783536060468297728
author	Krsnik, Ivan Glavaš, Goran Krsnik, Marina Miletić, Damir Štajduhar, Ivan
author_facet	Krsnik, Ivan Glavaš, Goran Krsnik, Marina Miletić, Damir Štajduhar, Ivan
author_sort	Krsnik, Ivan
collection	PubMed
description	Narrative texts in electronic health records can be efficiently utilized for building decision support systems in the clinic, only if they are correctly interpreted automatically in accordance with a specified standard. This paper tackles the problem of developing an automated method of labeling free-form radiology reports, as a precursor for building query-capable report databases in hospitals. The analyzed dataset consists of 1295 radiology reports concerning the condition of a knee, retrospectively gathered at the Clinical Hospital Centre Rijeka, Croatia. Reports were manually labeled with one or more labels from a set of 10 most commonly occurring clinical conditions. After primary preprocessing of the texts, two sets of text classification methods were compared: (1) traditional classification models—Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), and Random Forests (RF)—coupled with Bag-of-Words (BoW) features (i.e., symbolic text representation) and (2) Convolutional Neural Network (CNN) coupled with dense word vectors (i.e., word embeddings as a semantic text representation) as input features. We resorted to nested 10-fold cross-validation to evaluate the performance of competing methods using accuracy, precision, recall, and F1 score. The CNN with semantic word representations as input yielded the overall best performance, having a micro-averaged F1 score of [Formula: see text]. The CNN classifier yielded particularly encouraging results for the most represented conditions: degenerative disease ([Formula: see text]), arthrosis ([Formula: see text]), and injury ([Formula: see text]). As a data-hungry deep learning model, the CNN, however, performed notably worse than the competing models on underrepresented classes with fewer training instances such as multicausal disease or metabolic disease. LR, RF, and SVM performed comparably well, with the obtained micro-averaged F1 scores of [Formula: see text], [Formula: see text], and [Formula: see text], respectively.
format	Online Article Text
id	pubmed-7235892
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7235892 2020-05-28 Automatic Annotation of Narrative Radiology Reports Krsnik, Ivan Glavaš, Goran Krsnik, Marina Miletić, Damir Štajduhar, Ivan Diagnostics (Basel) Article Narrative texts in electronic health records can be efficiently utilized for building decision support systems in the clinic, only if they are correctly interpreted automatically in accordance with a specified standard. This paper tackles the problem of developing an automated method of labeling free-form radiology reports, as a precursor for building query-capable report databases in hospitals. The analyzed dataset consists of 1295 radiology reports concerning the condition of a knee, retrospectively gathered at the Clinical Hospital Centre Rijeka, Croatia. Reports were manually labeled with one or more labels from a set of 10 most commonly occurring clinical conditions. After primary preprocessing of the texts, two sets of text classification methods were compared: (1) traditional classification models—Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), and Random Forests (RF)—coupled with Bag-of-Words (BoW) features (i.e., symbolic text representation) and (2) Convolutional Neural Network (CNN) coupled with dense word vectors (i.e., word embeddings as a semantic text representation) as input features. We resorted to nested 10-fold cross-validation to evaluate the performance of competing methods using accuracy, precision, recall, and F1 score. The CNN with semantic word representations as input yielded the overall best performance, having a micro-averaged F1 score of [Formula: see text]. The CNN classifier yielded particularly encouraging results for the most represented conditions: degenerative disease ([Formula: see text]), arthrosis ([Formula: see text]), and injury ([Formula: see text]). As a data-hungry deep learning model, the CNN, however, performed notably worse than the competing models on underrepresented classes with fewer training instances such as multicausal disease or metabolic disease. LR, RF, and SVM performed comparably well, with the obtained micro-averaged F1 scores of [Formula: see text], [Formula: see text], and [Formula: see text], respectively. MDPI 2020-04-01 /pmc/articles/PMC7235892/ /pubmed/32244833 http://dx.doi.org/10.3390/diagnostics10040196 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Krsnik, Ivan Glavaš, Goran Krsnik, Marina Miletić, Damir Štajduhar, Ivan Automatic Annotation of Narrative Radiology Reports
title	Automatic Annotation of Narrative Radiology Reports
title_full	Automatic Annotation of Narrative Radiology Reports
title_fullStr	Automatic Annotation of Narrative Radiology Reports
title_full_unstemmed	Automatic Annotation of Narrative Radiology Reports
title_short	Automatic Annotation of Narrative Radiology Reports
title_sort	automatic annotation of narrative radiology reports
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7235892/ https://www.ncbi.nlm.nih.gov/pubmed/32244833 http://dx.doi.org/10.3390/diagnostics10040196
work_keys_str_mv	AT krsnikivan automaticannotationofnarrativeradiologyreports AT glavasgoran automaticannotationofnarrativeradiologyreports AT krsnikmarina automaticannotationofnarrativeradiologyreports AT mileticdamir automaticannotationofnarrativeradiologyreports AT stajduharivan automaticannotationofnarrativeradiologyreports
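
The description field above reports micro-averaged scores for a multi-label task, where each report may carry more than one condition label. As a rough illustration of what micro-averaging means, the sketch below pools true positives, false positives, and false negatives over all reports and labels before computing a single F1 value. This is not the authors' code: the function name `micro_f1` and the toy label sets are hypothetical, and only the condition names are taken from the abstract.

```python
from typing import Sequence, Set


def micro_f1(true_labels: Sequence[Set[str]], pred_labels: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 for multi-label data.

    Counts are pooled across every report and every label, so frequent
    labels contribute more to the final score than rare ones.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("true_labels and pred_labels must have equal length")

    tp = fp = fn = 0
    for truth, pred in zip(true_labels, pred_labels):
        tp += len(truth & pred)   # labels correctly predicted
        fp += len(pred - truth)   # labels predicted but not present
        fn += len(truth - pred)   # labels present but missed

    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there is nothing to score
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: three reports, each with a set of condition labels.
y_true = [{"injury"}, {"arthrosis", "degenerative disease"}, {"metabolic disease"}]
y_pred = [{"injury"}, {"arthrosis"}, {"injury"}]

score = micro_f1(y_true, y_pred)
print(f"micro-averaged F1 = {score:.3f}")
```

Because the counts are pooled, a classifier that does well on the most represented conditions can reach a high micro-averaged score even when it misses rare labels, which is consistent with the abstract's remark about underrepresented classes.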
