An intelligent prediagnosis system for disease prediction and examination recommendation based on electronic medical record and a medical-semantic-aware convolution neural network (MSCNN) for pediatric chronic cough

Bibliographic Details
Main authors: Zhu, Zhu; Li, Jing; Huang, Jian; Li, Zheming; Zhang, Hongjian; Chen, Siyu; Zhong, Qianhui; Xie, Yulan; Hu, Shasha; Wang, Yinshuo; Wang, Dejian; Yu, Gang
Format: Online Article Text
Language: English
Published: AME Publishing Company 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9360821/
https://www.ncbi.nlm.nih.gov/pubmed/35958012
http://dx.doi.org/10.21037/tp-22-275
Description
Summary:

BACKGROUND: Because different pediatric respiratory diseases with chronic cough have similar phenotypes, primary care doctors often misdiagnose them, and examinations are frequently misused. In the prediagnosis stage, the patient's chief complaint and other information in the electronic medical record (EMR) give respiratory specialists a strong basis for a preliminary diagnosis and an examination plan. In this paper, we propose an intelligent prediagnosis system that predicts the disease diagnosis and recommends examinations based on EMR text.

METHODS: We examined the clinical notes of 178,293 children with chronic cough symptoms from retrospective EMR data. The dataset was split 7:3 into training and testing sets, and 5% of the testing-set samples were extracted for validation. We proposed a medical-semantic-aware convolution neural network (MSCNN) framework that accomplishes two downstream tasks from the same medical language model through transfer learning. First, a medical language model based on the word2vec algorithm was built to generate embeddings for the text data. Then, a text convolutional neural network (TextCNN) was used to build models for disease prediction and examination recommendation.

RESULTS: We implemented five algorithms for disease prediction. In the disease prediction task, our algorithm outperformed the baseline methods on all metrics, with a top-1 accuracy (AC) of 0.68 and a top-3 AC of 0.923 on the testing set. With data enhancement added, the top-3 AC reached 0.926. In the examination recommendation task, the overall AC on the testing set was 0.93 and the macro average (MA) F1-score was 0.88. The average area under the curve (AUC) was 0.97 on the training set and 0.86 on the testing set.

CONCLUSIONS: We constructed an intelligent prediagnosis system with an MSCNN framework that can predict diseases and recommend examinations based on EMR data. Our approach performed well on a retrospective clinical dataset and therefore shows potential as an automated diagnostic aid in the prediagnosis stage of clinical practice, where it could help primary care doctors and doctors in basic-level hospitals. Because the proposed framework is general, it can be straightforwardly extended to prediagnosis for other diseases.
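
The top-1 and top-3 accuracy and the macro average F1-score reported in the summary can be illustrated with a short sketch. This is not the authors' evaluation code: the function names `top_k_accuracy` and `macro_f1`, the toy scores, and the toy labels are all hypothetical, and the F1 function assumes one label per sample, whereas the record does not say how the examination recommendation task was scored.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scored classes."""
    hits = 0
    for row, true_label in zip(scores, labels):
        # Class indices sorted by descending score (ties go to the lower index).
        ranked = sorted(range(len(row)), key=lambda c: (-row[c], c))[:k]
        if true_label in ranked:
            hits += 1
    return hits / len(labels)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data, made up for illustration: 4 samples scored over 4 candidate classes.
toy_scores = [
    [0.6, 0.2, 0.1, 0.1],  # true class 0: correct at top-1
    [0.3, 0.4, 0.2, 0.1],  # true class 0: missed at top-1, found within top-3
    [0.1, 0.2, 0.3, 0.4],  # true class 0: missed even within top-3
    [0.1, 0.1, 0.7, 0.1],  # true class 2: correct at top-1
]
toy_labels = [0, 0, 0, 2]

if __name__ == "__main__":
    print("top-1 accuracy:", top_k_accuracy(toy_scores, toy_labels, 1))
    print("top-3 accuracy:", top_k_accuracy(toy_scores, toy_labels, 3))
    print("macro F1:", macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
```

Top-k accuracy is the natural metric for a prediagnosis aid that presents a short list of candidate diseases rather than a single answer. The macro average weights every class equally, so rare classes count as much as common ones.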