Automated Detection of Measurements and Their Descriptors in Radiology Reports Using a Hybrid Natural Language Processing Algorithm

Bibliographic Details
Main Authors: Bozkurt, Selen, Alkim, Emel, Banerjee, Imon, Rubin, Daniel L.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6646482/
https://www.ncbi.nlm.nih.gov/pubmed/31222557
http://dx.doi.org/10.1007/s10278-019-00237-9
author Bozkurt, Selen
Alkim, Emel
Banerjee, Imon
Rubin, Daniel L.
collection PubMed
description Radiological measurements are reported in free-text reports, and it is challenging to extract such measurements for treatment planning tasks such as lesion summarization and cancer response assessment. The purpose of this work is to develop and evaluate a natural language processing (NLP) pipeline that can extract measurements and their core descriptors, such as temporality, anatomical entity, imaging observation, RadLex descriptors, series number, image number, and segment, from a wide variety of radiology reports (MR, CT, and mammogram). We created a hybrid NLP pipeline that integrates rule-based feature extraction modules and a conditional random field (CRF) model to extract the measurements from the radiology reports and link them with clinically relevant features such as anatomical entities or imaging observations. The pipeline was trained on 1117 CT/MR reports, and performance of the system was evaluated on an independent set of 100 expert-annotated CT/MR reports and also tested on 25 mammography reports. The system detected 813 out of 806 measurements in the CT/MR reports; 784 were true positives, 29 were false positives, and 0 were false negatives. Similarly, from the mammography reports, 96% of the measurements with their modifiers were extracted correctly. Our approach could enable the development of computerized applications that can utilize summarized lesion measurements from radiology reports of varying modalities and improve practice by tracking the same lesions across multiple radiologic encounters.
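The detection counts in the description above can be summarized with standard retrieval metrics. The sketch below is illustrative only: the abstract does not report precision, recall, or F1 by name, so applying these metrics is an assumption, and the function name `precision_recall_f1` is hypothetical. It uses only the true-positive, false-positive, and false-negative counts as stated for the CT/MR test set.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) computed from raw detection counts."""
    # Precision: share of detected measurements that were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true measurements that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts as stated in the abstract for the CT/MR test set.
precision, recall, f1 = precision_recall_f1(tp=784, fp=29, fn=0)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
# prints: precision=0.9643 recall=1.0000 f1=0.9818
```

Under these counts, every error is a false positive, so recall is 1.0 and precision alone determines F1.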
format Online
Article
Text
id pubmed-6646482
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-6646482 2019-08-06 Automated Detection of Measurements and Their Descriptors in Radiology Reports Using a Hybrid Natural Language Processing Algorithm Bozkurt, Selen Alkim, Emel Banerjee, Imon Rubin, Daniel L. J Digit Imaging Original Paper
Springer International Publishing 2019-06-20 2019-08 /pmc/articles/PMC6646482/ /pubmed/31222557 http://dx.doi.org/10.1007/s10278-019-00237-9 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
title Automated Detection of Measurements and Their Descriptors in Radiology Reports Using a Hybrid Natural Language Processing Algorithm
topic Original Paper