Detection of medical text semantic similarity based on convolutional neural network

BACKGROUND: Imaging examinations, such as ultrasonography, magnetic resonance imaging and computed tomography scans, play key roles in healthcare settings. To assess and improve the quality of imaging diagnosis, we need to manually find and compare the pre-existing reports of imaging and pathology e...

Bibliographic Details
Main authors:	Zheng, Tao; Gao, Yimei; Wang, Fei; Fan, Chenhao; Fu, Xingzhi; Li, Mei; Zhang, Ya; Zhang, Shaodian; Ma, Handong
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6686478/ https://www.ncbi.nlm.nih.gov/pubmed/31391038 http://dx.doi.org/10.1186/s12911-019-0880-2

_version_	1783442575536947200
author	Zheng, Tao; Gao, Yimei; Wang, Fei; Fan, Chenhao; Fu, Xingzhi; Li, Mei; Zhang, Ya; Zhang, Shaodian; Ma, Handong
author_facet	Zheng, Tao; Gao, Yimei; Wang, Fei; Fan, Chenhao; Fu, Xingzhi; Li, Mei; Zhang, Ya; Zhang, Shaodian; Ma, Handong
author_sort	Zheng, Tao
collection	PubMed
description	BACKGROUND: Imaging examinations, such as ultrasonography, magnetic resonance imaging and computed tomography scans, play key roles in healthcare settings. To assess and improve the quality of imaging diagnosis, we need to manually find and compare the pre-existing reports of imaging and pathology examinations which contain overlapping exam body sites from electronic medical records (EMRs). The process of retrieving those reports is time-consuming. In this paper, we propose a convolutional neural network (CNN) based method which can better utilize semantic information contained in report texts to accelerate the retrieving process. METHODS: We included 16,354 imaging and pathology report-pairs from 1926 patients who were admitted to Shanghai Tongren Hospital and had ultrasonic examinations between 1st May 2017 and 31st July 2017. We adapted the CNN model to calculate the similarities among the report-pairs to identify target report-pairs with overlapping body sites, and compared its performance with six other conventional models, including keyword mapping, latent semantic analysis (LSA), latent Dirichlet allocation (LDA), Doc2Vec, Siamese long short term memory (LSTM) and a model based on named entity recognition (NER). We also utilized a graph embedding method to enhance the word representation by capturing semantic relation information from medical ontologies. Additionally, we used the LIME algorithm to identify which features (or words) are decisive for the prediction results, improving model interpretability. RESULTS: Experimental results showed that our CNN model achieved significant improvement over all other conventional models on area under the receiver operating characteristic (AUROC), precision, recall and F1-score on our test dataset. The AUROC of our CNN models improved by approximately 3–7%. The AUROC of the CNN model with graph-embedding and ontology based medical concept vectors was 0.8% higher than the model with randomly initialized vectors and 1.5% higher than the one with pre-trained word vectors. CONCLUSION: Our study demonstrates that a CNN model with pre-trained medical concept vectors could accurately identify target report-pairs with overlapping body sites and potentially accelerate the retrieving process for imaging diagnosis quality measurement.
format	Online Article Text
id	pubmed-6686478
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6686478 2019-08-12 Detection of medical text semantic similarity based on convolutional neural network Zheng, Tao; Gao, Yimei; Wang, Fei; Fan, Chenhao; Fu, Xingzhi; Li, Mei; Zhang, Ya; Zhang, Shaodian; Ma, Handong BMC Med Inform Decis Mak Research Article BACKGROUND: Imaging examinations, such as ultrasonography, magnetic resonance imaging and computed tomography scans, play key roles in healthcare settings. To assess and improve the quality of imaging diagnosis, we need to manually find and compare the pre-existing reports of imaging and pathology examinations which contain overlapping exam body sites from electronic medical records (EMRs). The process of retrieving those reports is time-consuming. In this paper, we propose a convolutional neural network (CNN) based method which can better utilize semantic information contained in report texts to accelerate the retrieving process. METHODS: We included 16,354 imaging and pathology report-pairs from 1926 patients who were admitted to Shanghai Tongren Hospital and had ultrasonic examinations between 1st May 2017 and 31st July 2017. We adapted the CNN model to calculate the similarities among the report-pairs to identify target report-pairs with overlapping body sites, and compared its performance with six other conventional models, including keyword mapping, latent semantic analysis (LSA), latent Dirichlet allocation (LDA), Doc2Vec, Siamese long short term memory (LSTM) and a model based on named entity recognition (NER). We also utilized a graph embedding method to enhance the word representation by capturing semantic relation information from medical ontologies. Additionally, we used the LIME algorithm to identify which features (or words) are decisive for the prediction results, improving model interpretability. RESULTS: Experimental results showed that our CNN model achieved significant improvement over all other conventional models on area under the receiver operating characteristic (AUROC), precision, recall and F1-score on our test dataset. The AUROC of our CNN models improved by approximately 3–7%. The AUROC of the CNN model with graph-embedding and ontology based medical concept vectors was 0.8% higher than the model with randomly initialized vectors and 1.5% higher than the one with pre-trained word vectors. CONCLUSION: Our study demonstrates that a CNN model with pre-trained medical concept vectors could accurately identify target report-pairs with overlapping body sites and potentially accelerate the retrieving process for imaging diagnosis quality measurement. BioMed Central 2019-08-07 /pmc/articles/PMC6686478/ /pubmed/31391038 http://dx.doi.org/10.1186/s12911-019-0880-2 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Research Article Zheng, Tao Gao, Yimei Wang, Fei Fan, Chenhao Fu, Xingzhi Li, Mei Zhang, Ya Zhang, Shaodian Ma, Handong Detection of medical text semantic similarity based on convolutional neural network
title	Detection of medical text semantic similarity based on convolutional neural network
title_full	Detection of medical text semantic similarity based on convolutional neural network
title_fullStr	Detection of medical text semantic similarity based on convolutional neural network
title_full_unstemmed	Detection of medical text semantic similarity based on convolutional neural network
title_short	Detection of medical text semantic similarity based on convolutional neural network
title_sort	detection of medical text semantic similarity based on convolutional neural network
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6686478/ https://www.ncbi.nlm.nih.gov/pubmed/31391038 http://dx.doi.org/10.1186/s12911-019-0880-2
work_keys_str_mv	AT zhengtao detectionofmedicaltextsemanticsimilaritybasedonconvolutionalneuralnetwork AT gaoyimei detectionofmedicaltextsemanticsimilaritybasedonconvolutionalneuralnetwork AT wangfei detectionofmedicaltextsemanticsimilaritybasedonconvolutionalneuralnetwork AT fanchenhao detectionofmedicaltextsemanticsimilaritybasedonconvolutionalneuralnetwork AT fuxingzhi detectionofmedicaltextsemanticsimilaritybasedonconvolutionalneuralnetwork AT limei detectionofmedicaltextsemanticsimilaritybasedonconvolutionalneuralnetwork AT zhangya detectionofmedicaltextsemanticsimilaritybasedonconvolutionalneuralnetwork AT zhangshaodian detectionofmedicaltextsemanticsimilaritybasedonconvolutionalneuralnetwork AT mahandong detectionofmedicaltextsemanticsimilaritybasedonconvolutionalneuralnetwork
