
The implementation of natural language processing to extract index lesions from breast magnetic resonance imaging reports

BACKGROUND: There are often multiple lesions in breast magnetic resonance imaging (MRI) reports and radiologists usually focus on describing the index lesion that is most crucial to clinicians in determining the management and prognosis of patients. Natural language processing (NLP) has been used fo...


Bibliographic Details
Main Authors: Liu, Yi, Liu, Qing, Han, Chao, Zhang, Xiaodong, Wang, Xiaoying
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6937920/
https://www.ncbi.nlm.nih.gov/pubmed/31888615
http://dx.doi.org/10.1186/s12911-019-0997-3
_version_ 1783483966893850624
author Liu, Yi
Liu, Qing
Han, Chao
Zhang, Xiaodong
Wang, Xiaoying
collection PubMed
description BACKGROUND: There are often multiple lesions in breast magnetic resonance imaging (MRI) reports and radiologists usually focus on describing the index lesion that is most crucial to clinicians in determining the management and prognosis of patients. Natural language processing (NLP) has been used for information extraction from mammography reports. However, few studies have investigated NLP in breast MRI data based on free-form text. The objective of the current study was to assess the validity of our NLP program to accurately extract index lesions and their corresponding imaging features from free-form text of breast MRI reports.
METHODS: This cross-sectional study examined 1633 free-form text reports of breast MRIs from 2014 to 2017. First, the NLP system was used to extract 9 features from all the lesions in the reports according to the Breast Imaging Reporting and Data System (BI-RADS) descriptors. Second, the index lesion was defined as the lesion with the largest number of imaging features. Third, we extracted the values of each imaging feature and the BI-RADS category from each index lesion. To evaluate the accuracy of our system, 478 reports were manually reviewed by two individuals. The time taken to extract data by NLP was compared with that by reviewers.
RESULTS: The NLP system extracted 889 lesions from 478 reports. The mean number of imaging features per lesion was 6.5 ± 2.1 (range: 3–9; 95% CI: 6.362–6.638). The mean number of imaging features per index lesion was 8.0 ± 1.1 (range: 5–9; 95% CI: 7.901–8.099). The NLP system demonstrated a recall of 100.0% and a precision of 99.6% for correct identification of the index lesion. The recall and precision of NLP to correctly extract the value of imaging features from the index lesions were 91.0 and 92.6%, respectively. The recall and precision for the correct identification of the BI-RADS categories were 96.6 and 94.8%, respectively. NLP generated the total results in less than 1 s, whereas the manual reviewers averaged 4.47 min and 4.56 min per report.
CONCLUSIONS: Our NLP method successfully extracted the index lesion and its corresponding information from free-form text.
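The selection rule in the abstract (the index lesion is the lesion with the largest number of imaging features) and the reported confidence intervals can be illustrated with a minimal Python sketch. This is not the authors' program: the dictionary layout, the feature names (`shape`, `margin`, `size`), the tie-breaking rule, and the helper names `count_features`, `pick_index_lesion`, and `ci95` are all assumptions made for illustration. The interval helper assumes a normal approximation, mean ± 1.96 × SD / √n, which the abstract does not state.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: each lesion maps a feature name to the value
# extracted from the report text, or None when the feature was not found.
Lesion = Dict[str, Optional[str]]


def count_features(lesion: Lesion) -> int:
    """Number of imaging features actually extracted for one lesion."""
    return sum(1 for value in lesion.values() if value is not None)


def pick_index_lesion(lesions: List[Lesion]) -> Optional[Lesion]:
    """Return the lesion with the most extracted features.

    Ties go to the first lesion listed (an assumption; the abstract does
    not say how ties are handled). Returns None for an empty report.
    """
    if not lesions:
        return None
    return max(lesions, key=count_features)


def ci95(mean: float, sd: float, n: int) -> Tuple[float, float]:
    """Normal-approximation 95% CI for a mean, rounded to 3 decimals."""
    half_width = 1.96 * sd / math.sqrt(n)
    return (round(mean - half_width, 3), round(mean + half_width, 3))


# Made-up example report with two lesions (feature names are illustrative).
report_lesions = [
    {"shape": "oval", "margin": "circumscribed", "size": None},
    {"shape": "irregular", "margin": "spiculated", "size": "23 mm"},
]
index_lesion = pick_index_lesion(report_lesions)
print(index_lesion["shape"])   # irregular (3 features vs. 2)

# Summary statistics taken from the abstract.
print(ci95(6.5, 2.1, 889))     # (6.362, 6.638)
print(ci95(8.0, 1.1, 478))     # (7.901, 8.099)
```

Under that normal-approximation assumption, the sketch reproduces both intervals given in the abstract, using n = 889 lesions and n = 478 index lesions (one per reviewed report).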
format Online
Article
Text
id pubmed-6937920
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6937920 2019-12-31 BMC Med Inform Decis Mak Research Article BioMed Central 2019-12-30 /pmc/articles/PMC6937920/ /pubmed/31888615 http://dx.doi.org/10.1186/s12911-019-0997-3 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title The implementation of natural language processing to extract index lesions from breast magnetic resonance imaging reports
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6937920/
https://www.ncbi.nlm.nih.gov/pubmed/31888615
http://dx.doi.org/10.1186/s12911-019-0997-3