The implementation of natural language processing to extract index lesions from breast magnetic resonance imaging reports
BACKGROUND: There are often multiple lesions in breast magnetic resonance imaging (MRI) reports and radiologists usually focus on describing the index lesion that is most crucial to clinicians in determining the management and prognosis of patients. Natural language processing (NLP) has been used fo...
Main authors: | Liu, Yi; Liu, Qing; Han, Chao; Zhang, Xiaodong; Wang, Xiaoying |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2019 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6937920/ https://www.ncbi.nlm.nih.gov/pubmed/31888615 http://dx.doi.org/10.1186/s12911-019-0997-3 |
_version_ | 1783483966893850624 |
---|---|
author | Liu, Yi Liu, Qing Han, Chao Zhang, Xiaodong Wang, Xiaoying |
author_facet | Liu, Yi Liu, Qing Han, Chao Zhang, Xiaodong Wang, Xiaoying |
author_sort | Liu, Yi |
collection | PubMed |
description | BACKGROUND: There are often multiple lesions in breast magnetic resonance imaging (MRI) reports and radiologists usually focus on describing the index lesion that is most crucial to clinicians in determining the management and prognosis of patients. Natural language processing (NLP) has been used for information extraction from mammography reports. However, few studies have investigated NLP in breast MRI data based on free-form text. The objective of the current study was to assess the validity of our NLP program to accurately extract index lesions and their corresponding imaging features from free-form text of breast MRI reports. METHODS: This cross-sectional study examined 1633 free-form text reports of breast MRIs from 2014 to 2017. First, the NLP system was used to extract 9 features from all the lesions in the reports according to the Breast Imaging Reporting and Data System (BI-RADS) descriptors. Second, the index lesion was defined as the lesion with the largest number of imaging features. Third, we extracted the values of each imaging feature and the BI-RADS category from each index lesion. To evaluate the accuracy of our system, 478 reports were manually reviewed by two individuals. The time taken to extract data by NLP was compared with that by reviewers. RESULTS: The NLP system extracted 889 lesions from 478 reports. The mean number of imaging features per lesion was 6.5 ± 2.1 (range: 3–9; 95% CI: 6.362–6.638). The mean number of imaging features per index lesion was 8.0 ± 1.1 (range: 5–9; 95% CI: 7.901–8.099). The NLP system demonstrated a recall of 100.0% and a precision of 99.6% for correct identification of the index lesion. The recall and precision of NLP to correctly extract the value of imaging features from the index lesions were 91.0 and 92.6%, respectively. The recall and precision for the correct identification of the BI-RADS categories were 96.6 and 94.8%, respectively. NLP generated the total results in less than 1 s, whereas the manual reviewers averaged 4.47 min and 4.56 min per report. CONCLUSIONS: Our NLP method successfully extracted the index lesion and its corresponding information from free-form text. |
format | Online Article Text |
id | pubmed-6937920 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6937920 2019-12-31 The implementation of natural language processing to extract index lesions from breast magnetic resonance imaging reports Liu, Yi Liu, Qing Han, Chao Zhang, Xiaodong Wang, Xiaoying BMC Med Inform Decis Mak Research Article BACKGROUND: There are often multiple lesions in breast magnetic resonance imaging (MRI) reports and radiologists usually focus on describing the index lesion that is most crucial to clinicians in determining the management and prognosis of patients. Natural language processing (NLP) has been used for information extraction from mammography reports. However, few studies have investigated NLP in breast MRI data based on free-form text. The objective of the current study was to assess the validity of our NLP program to accurately extract index lesions and their corresponding imaging features from free-form text of breast MRI reports. METHODS: This cross-sectional study examined 1633 free-form text reports of breast MRIs from 2014 to 2017. First, the NLP system was used to extract 9 features from all the lesions in the reports according to the Breast Imaging Reporting and Data System (BI-RADS) descriptors. Second, the index lesion was defined as the lesion with the largest number of imaging features. Third, we extracted the values of each imaging feature and the BI-RADS category from each index lesion. To evaluate the accuracy of our system, 478 reports were manually reviewed by two individuals. The time taken to extract data by NLP was compared with that by reviewers. RESULTS: The NLP system extracted 889 lesions from 478 reports. The mean number of imaging features per lesion was 6.5 ± 2.1 (range: 3–9; 95% CI: 6.362–6.638). The mean number of imaging features per index lesion was 8.0 ± 1.1 (range: 5–9; 95% CI: 7.901–8.099). The NLP system demonstrated a recall of 100.0% and a precision of 99.6% for correct identification of the index lesion. The recall and precision of NLP to correctly extract the value of imaging features from the index lesions were 91.0 and 92.6%, respectively. The recall and precision for the correct identification of the BI-RADS categories were 96.6 and 94.8%, respectively. NLP generated the total results in less than 1 s, whereas the manual reviewers averaged 4.47 min and 4.56 min per report. CONCLUSIONS: Our NLP method successfully extracted the index lesion and its corresponding information from free-form text. BioMed Central 2019-12-30 /pmc/articles/PMC6937920/ /pubmed/31888615 http://dx.doi.org/10.1186/s12911-019-0997-3 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | The implementation of natural language processing to extract index lesions from breast magnetic resonance imaging reports |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6937920/ https://www.ncbi.nlm.nih.gov/pubmed/31888615 http://dx.doi.org/10.1186/s12911-019-0997-3 |
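
The selection rule and the evaluation figures in the abstract can be illustrated with a short Python sketch. It is not the authors' program: the record layout, the feature names, the function names and the first-wins tie-break are all assumptions made for illustration. The interval check assumes a normal-approximation interval (mean ± 1.96·SD/√n) and assumes one index lesion per reviewed report (n = 478); under those assumptions it reproduces the reported 95% CIs.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical record layout: feature name -> extracted value (None if absent).
Lesion = Dict[str, Optional[str]]


def feature_count(lesion: Lesion) -> int:
    """Count the imaging features that have an extracted value."""
    return sum(1 for value in lesion.values() if value is not None)


def select_index_lesion(lesions: List[Lesion]) -> Optional[Lesion]:
    """Return the lesion with the most extracted features.

    Ties go to the first such lesion; the abstract does not state a
    tie-break, so this is an assumption.
    """
    if not lesions:
        return None
    return max(lesions, key=feature_count)


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)


def ci95(mean: float, sd: float, n: int) -> Tuple[float, float]:
    """Normal-approximation 95% confidence interval for a mean."""
    half_width = 1.96 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Made-up lesions; the feature names are illustrative only.
lesions = [
    {"shape": "oval", "margin": None, "size": "8 mm"},
    {"shape": "irregular", "margin": "spiculated", "size": "21 mm"},
]
index_lesion = select_index_lesion(lesions)

# Check the reported intervals from the mean, SD and sample size.
all_lesions_ci = tuple(round(x, 3) for x in ci95(6.5, 2.1, 889))
index_lesions_ci = tuple(round(x, 3) for x in ci95(8.0, 1.1, 478))
```

Counting only non-empty values means a lesion that is mentioned but barely described never outranks a fully characterised one, which matches the stated definition of the index lesion as the lesion with the largest number of imaging features.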