Automatic extraction of imaging observation and assessment categories from breast magnetic resonance imaging reports with natural language processing

BACKGROUND: Structured reports are not widely used and thus most reports exist in the form of free text. The process of data extraction by experts is time-consuming and error-prone, whereas data extraction by natural language processing (NLP) is a potential solution that could improve diagnosis effi...

Bibliographic Details
Main Authors:	Liu, Yi, Zhu, Li-Na, Liu, Qing, Han, Chao, Zhang, Xiao-Dong, Wang, Xiao-Ying
Format:	Online Article Text
Language:	English
Published:	Wolters Kluwer Health 2019
Subjects:	Original Articles
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6759110/ https://www.ncbi.nlm.nih.gov/pubmed/31268905 http://dx.doi.org/10.1097/CM9.0000000000000301

_version_	1783453636121067520
author	Liu, Yi Zhu, Li-Na Liu, Qing Han, Chao Zhang, Xiao-Dong Wang, Xiao-Ying
author_facet	Liu, Yi Zhu, Li-Na Liu, Qing Han, Chao Zhang, Xiao-Dong Wang, Xiao-Ying
author_sort	Liu, Yi
collection	PubMed
description	BACKGROUND: Structured reports are not widely used and thus most reports exist in the form of free text. The process of data extraction by experts is time-consuming and error-prone, whereas data extraction by natural language processing (NLP) is a potential solution that could improve diagnosis efficiency and accuracy. The purpose of this study was to evaluate an NLP program that determines American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) descriptors and final assessment categories from breast magnetic resonance imaging (MRI) reports. METHODS: This cross-sectional study involved 2330 breast MRI reports in the electronic medical record from 2009 to 2017. We used 1635 reports for the creation of a revised BI-RADS MRI lexicon and synonyms lists as well as the iterative development of an NLP system. The remaining 695 reports that were not used for developing the system were used as an independent test set for the final evaluation of the NLP system. The recall and precision of an NLP algorithm to detect the revised BI-RADS MRI descriptors and BI-RADS categories from the free-text reports were evaluated against a standard reference of manual human review. RESULTS: There was a high level of agreement between two manual reviewers, with a κ value of 0.95. For all breast imaging reports, the NLP algorithm demonstrated a recall of 78.5% and a precision of 86.1% for correct identification of the revised BI-RADS MRI descriptors and the BI-RADS categories. NLP generated the total results in <1 s, whereas the manual reviewers averaged 3.38 and 3.23 min per report, respectively. CONCLUSIONS: The NLP algorithm demonstrates high recall and precision for information extraction from free-text reports. This approach will help to narrow the gap between unstructured report text and structured data, which is needed in decision support and other applications.
format	Online Article Text
id	pubmed-6759110
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Wolters Kluwer Health
record_format	MEDLINE/PubMed
spelling	pubmed-6759110 2019-10-07 Automatic extraction of imaging observation and assessment categories from breast magnetic resonance imaging reports with natural language processing Liu, Yi Zhu, Li-Na Liu, Qing Han, Chao Zhang, Xiao-Dong Wang, Xiao-Ying Chin Med J (Engl) Original Articles Wolters Kluwer Health 2019-07-20 2019-07-20 /pmc/articles/PMC6759110/ /pubmed/31268905 http://dx.doi.org/10.1097/CM9.0000000000000301 Text en Copyright © 2019 The Chinese Medical Association, produced by Wolters Kluwer, Inc. under the CC-BY-NC-ND license. This is an open access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. http://creativecommons.org/licenses/by-nc-nd/4.0
spellingShingle	Original Articles Liu, Yi Zhu, Li-Na Liu, Qing Han, Chao Zhang, Xiao-Dong Wang, Xiao-Ying Automatic extraction of imaging observation and assessment categories from breast magnetic resonance imaging reports with natural language processing
title	Automatic extraction of imaging observation and assessment categories from breast magnetic resonance imaging reports with natural language processing
topic	Original Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6759110/ https://www.ncbi.nlm.nih.gov/pubmed/31268905 http://dx.doi.org/10.1097/CM9.0000000000000301
