
Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study

BACKGROUND: Diagnostic neurovascular imaging data are important in stroke research, but obtaining these data typically requires laborious manual chart reviews. OBJECTIVE: We aimed to determine the accuracy of a natural language processing (NLP) approach to extract information on the presence and loc...


Bibliographic Details
Main Authors:	Yu, Amy Y X; Liu, Zhongyu A; Pou-Prom, Chloe; Lopes, Kaitlyn; Kapral, Moira K; Aviv, Richard I; Mamdani, Muhammad
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2021
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8132979/ https://www.ncbi.nlm.nih.gov/pubmed/33944791 http://dx.doi.org/10.2196/24381

_version_	1783694998956408832
author	Yu, Amy Y X; Liu, Zhongyu A; Pou-Prom, Chloe; Lopes, Kaitlyn; Kapral, Moira K; Aviv, Richard I; Mamdani, Muhammad
author_sort	Yu, Amy Y X
collection	PubMed
description	BACKGROUND: Diagnostic neurovascular imaging data are important in stroke research, but obtaining these data typically requires laborious manual chart reviews. OBJECTIVE: We aimed to determine the accuracy of a natural language processing (NLP) approach to extract information on the presence and location of vascular occlusions as well as other stroke-related attributes based on free-text reports. METHODS: From the full reports of 1320 consecutive computed tomography (CT), CT angiography, and CT perfusion scans of the head and neck performed at a tertiary stroke center between October 2017 and January 2019, we manually extracted data on the presence of proximal large vessel occlusion (primary outcome), as well as distal vessel occlusion, ischemia, hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status (secondary outcomes). Reports were randomly split into training (n=921) and validation (n=399) sets, and attributes were extracted using rule-based NLP. We reported the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the overall accuracy of the NLP approach relative to the manually extracted data. RESULTS: The overall prevalence of large vessel occlusion was 12.2%. In the training sample, the NLP approach identified this attribute with an overall accuracy of 97.3% (95.5% sensitivity, 98.1% specificity, 84.1% PPV, and 99.4% NPV). In the validation set, the overall accuracy was 95.2% (90.0% sensitivity, 97.4% specificity, 76.3% PPV, and 98.5% NPV). The accuracy of identifying distal or basilar occlusion as well as hemorrhage was also high, but there were limitations in identifying cerebral ischemia, ASPECTS, and collateral status. CONCLUSIONS: NLP may improve the efficiency of large-scale imaging data collection for stroke surveillance and research.
format	Online Article Text
id	pubmed-8132979
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-8132979 2021-05-24 Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study Yu, Amy Y X; Liu, Zhongyu A; Pou-Prom, Chloe; Lopes, Kaitlyn; Kapral, Moira K; Aviv, Richard I; Mamdani, Muhammad JMIR Med Inform Original Paper BACKGROUND: Diagnostic neurovascular imaging data are important in stroke research, but obtaining these data typically requires laborious manual chart reviews. OBJECTIVE: We aimed to determine the accuracy of a natural language processing (NLP) approach to extract information on the presence and location of vascular occlusions as well as other stroke-related attributes based on free-text reports. METHODS: From the full reports of 1320 consecutive computed tomography (CT), CT angiography, and CT perfusion scans of the head and neck performed at a tertiary stroke center between October 2017 and January 2019, we manually extracted data on the presence of proximal large vessel occlusion (primary outcome), as well as distal vessel occlusion, ischemia, hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status (secondary outcomes). Reports were randomly split into training (n=921) and validation (n=399) sets, and attributes were extracted using rule-based NLP. We reported the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the overall accuracy of the NLP approach relative to the manually extracted data. RESULTS: The overall prevalence of large vessel occlusion was 12.2%. In the training sample, the NLP approach identified this attribute with an overall accuracy of 97.3% (95.5% sensitivity, 98.1% specificity, 84.1% PPV, and 99.4% NPV). In the validation set, the overall accuracy was 95.2% (90.0% sensitivity, 97.4% specificity, 76.3% PPV, and 98.5% NPV). The accuracy of identifying distal or basilar occlusion as well as hemorrhage was also high, but there were limitations in identifying cerebral ischemia, ASPECTS, and collateral status. CONCLUSIONS: NLP may improve the efficiency of large-scale imaging data collection for stroke surveillance and research. JMIR Publications 2021-05-04 /pmc/articles/PMC8132979/ /pubmed/33944791 http://dx.doi.org/10.2196/24381 Text en ©Amy Y X Yu, Zhongyu A Liu, Chloe Pou-Prom, Kaitlyn Lopes, Moira K Kapral, Richard I Aviv, Muhammad Mamdani. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 04.05.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
spellingShingle	Original Paper Yu, Amy Y X Liu, Zhongyu A Pou-Prom, Chloe Lopes, Kaitlyn Kapral, Moira K Aviv, Richard I Mamdani, Muhammad Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study
title	Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study
title_sort	automating stroke data extraction from free-text radiology reports using natural language processing: instrument validation study
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8132979/ https://www.ncbi.nlm.nih.gov/pubmed/33944791 http://dx.doi.org/10.2196/24381
