Neural Network Assisted Pathology Case Identification

BACKGROUND: Traditionally, cases for cohort selection and quality assurance purposes are identified through structured query language (SQL) searches matching specific keywords. Recently, several neural network-based natural language processing (NLP) pipelines have emerged as an accurate alternative/...

Bibliographic Details
Main Author:	Cheng, Jerome
Format:	Online Article Text
Language:	English
Published:	Elsevier 2022
Subjects:	Short Communication
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8860736/ https://www.ncbi.nlm.nih.gov/pubmed/35242447 http://dx.doi.org/10.1016/j.jpi.2022.100008

_version_	1784654738209374208
author	Cheng, Jerome
collection	PubMed
description	BACKGROUND: Traditionally, cases for cohort selection and quality assurance purposes are identified through structured query language (SQL) searches matching specific keywords. Recently, several neural network-based natural language processing (NLP) pipelines have emerged as an accurate alternative/complementary method for case retrieval. METHODS: The diagnosis sections of 1000 pathology reports containing the terms “colon” and “carcinoma” were retrieved from our laboratory information system through a SQL query. Each report was labeled as either positive or negative; a case was considered positive if it was a primary adenocarcinoma of the colon. Negative cases comprised adenocarcinomas from other sites, metastatic adenocarcinomas, benign conditions, rectal cancers, and other cases that did not fit in the primary colonic adenocarcinoma category. The 1000 cases were randomly separated into training, validation, and holdout sets. A convolutional neural network (CNN) model built using Keras (a neural network library) was trained to identify positive cases, and the model was applied to the holdout set to predict the category for each case. RESULTS: The CNN model correctly classified 141 out of 149 primary colonic adenocarcinoma cases and 43 out of 51 negative cases, achieving an accuracy of 92% and an area under the ROC curve (AUC) of 0.957. CONCLUSION: Trained convolutional neural network models, by themselves or as an adjunct to keyword and pattern-based text extraction methods, may be used to search for pathology cases of interest with high accuracy.
format	Online Article Text
id	pubmed-8860736
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Elsevier
record_format	MEDLINE/PubMed
spelling	pubmed-8860736 2022-03-02 Neural Network Assisted Pathology Case Identification Cheng, Jerome J Pathol Inform Short Communication BACKGROUND: Traditionally, cases for cohort selection and quality assurance purposes are identified through structured query language (SQL) searches matching specific keywords. Recently, several neural network-based natural language processing (NLP) pipelines have emerged as an accurate alternative/complementary method for case retrieval. METHODS: The diagnosis sections of 1000 pathology reports containing the terms “colon” and “carcinoma” were retrieved from our laboratory information system through a SQL query. Each report was labeled as either positive or negative; a case was considered positive if it was a primary adenocarcinoma of the colon. Negative cases comprised adenocarcinomas from other sites, metastatic adenocarcinomas, benign conditions, rectal cancers, and other cases that did not fit in the primary colonic adenocarcinoma category. The 1000 cases were randomly separated into training, validation, and holdout sets. A convolutional neural network (CNN) model built using Keras (a neural network library) was trained to identify positive cases, and the model was applied to the holdout set to predict the category for each case. RESULTS: The CNN model correctly classified 141 out of 149 primary colonic adenocarcinoma cases and 43 out of 51 negative cases, achieving an accuracy of 92% and an area under the ROC curve (AUC) of 0.957. CONCLUSION: Trained convolutional neural network models, by themselves or as an adjunct to keyword and pattern-based text extraction methods, may be used to search for pathology cases of interest with high accuracy. Elsevier 2022-01-20 /pmc/articles/PMC8860736/ /pubmed/35242447 http://dx.doi.org/10.1016/j.jpi.2022.100008 Text en © 2022 The Author https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Short Communication Cheng, Jerome Neural Network Assisted Pathology Case Identification
title	Neural Network Assisted Pathology Case Identification
topic	Short Communication
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8860736/ https://www.ncbi.nlm.nih.gov/pubmed/35242447 http://dx.doi.org/10.1016/j.jpi.2022.100008
work_keys_str_mv	AT chengjerome neuralnetworkassistedpathologycaseidentification
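
The accuracy reported in the description field can be checked directly from the holdout counts it gives (141 of 149 positive cases and 43 of 51 negative cases classified correctly). The short Python sketch below does that arithmetic. The variable names are illustrative, and the sensitivity and specificity values are derived here from those counts, not quoted from the abstract. The AUC of 0.957 cannot be recomputed from these counts alone, since it depends on the per-case scores.

```python
# Holdout-set counts as stated in the abstract (description field).
positives = 149        # primary colonic adenocarcinoma cases in the holdout set
true_positives = 141   # positive cases the CNN classified correctly
negatives = 51         # all other cases in the holdout set
true_negatives = 43    # negative cases the CNN classified correctly

# Misclassified cases follow by subtraction.
false_negatives = positives - true_positives
false_positives = negatives - true_negatives

# Accuracy: correctly classified cases over all holdout cases.
accuracy = (true_positives + true_negatives) / (positives + negatives)

# Derived metrics (not reported in the abstract; computed from the counts above).
sensitivity = true_positives / positives
specificity = true_negatives / negatives

print(f"holdout size: {positives + negatives}")
print(f"accuracy:     {accuracy:.3f}")
print(f"sensitivity:  {sensitivity:.3f}")
print(f"specificity:  {specificity:.3f}")
```

The computed accuracy of 184/200 agrees with the 92% stated in the abstract.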
