Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records

PURPOSE: Predictively diagnosing infectious diseases helps in providing better treatment and enhances the prevention and control of such diseases. This study uses actual data from a hospital. A multiple infectious disease diagnostic model (MIDDM) is designed for conducting multi-classification of in...

Bibliographic Details
Main authors: Wang, Mengying, Wei, Zhenhao, Jia, Mo, Chen, Lianzhong, Ji, Hong
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8848865/
https://www.ncbi.nlm.nih.gov/pubmed/35168624
http://dx.doi.org/10.1186/s12911-022-01776-y
_version_ 1784652346797588480
author Wang, Mengying
Wei, Zhenhao
Jia, Mo
Chen, Lianzhong
Ji, Hong
author_facet Wang, Mengying
Wei, Zhenhao
Jia, Mo
Chen, Lianzhong
Ji, Hong
author_sort Wang, Mengying
collection PubMed
description PURPOSE: Predictively diagnosing infectious diseases helps in providing better treatment and enhances the prevention and control of such diseases. This study uses actual data from a hospital. A multiple infectious disease diagnostic model (MIDDM) is designed for conducting multi-classification of infectious diseases so as to assist in clinical infectious-disease decision-making. METHODS: Based on actual hospital medical records of infectious diseases from December 2012 to December 2020, a deep learning model for multi-classification research on infectious diseases is constructed. The data includes 20,620 cases covering seven types of infectious diseases, including outpatients and inpatients, of which training data accounted for 80%, i.e., 16,496 cases, and test data accounted for 20%, i.e., 4124 cases. Through the auto-encoder, data normalization and sparse data densification processing are carried out to improve the model training effect. A residual network and attention mechanism are introduced into the MIDDM model to improve the performance of the model. RESULT: MIDDM achieved improved prediction results in diagnosing seven kinds of infectious diseases. In the case of similar disease diagnosis characteristics and similar interference factors, the prediction accuracy of disease classification with more sample data is significantly higher than the prediction accuracy of disease classification with fewer sample data. For instance, the training data for viral hepatitis, influenza, and hand, foot, and mouth disease were 2954, 3924, and 3015 respectively and the corresponding test accuracy rates were 99.86%, 98.47%, and 97.31%. There is less training data for syphilis, infectious diarrhea, and measles, i.e., 1208, 575, and 190 respectively and the corresponding test accuracy rates were noticeably lower, i.e., 83.03%, 87.30%, and 42.11%. We also compared the MIDDM model with the models used in other studies. Using the same input data, taking viral hepatitis as an example, the accuracy of MIDDM is 99.44%, which is significantly higher than that of XGBoost (96.19%), Decision tree (90.13%), Bayesian method (85.19%), and logistic regression (91.26%). Other diseases were also significantly better predicted by MIDDM than by these three models. CONCLUSION: The application of the MIDDM model to multi-class diagnosis and prediction of infectious diseases can improve the accuracy of infectious-disease diagnosis. However, these results need to be further confirmed via clinical randomized controlled trials.
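The split sizes and the sample-size trend stated in the description can be checked with a few lines of arithmetic. The sketch below is not the authors' code: it only re-uses the figures quoted in the abstract, and the names `REPORTED`, `SPLIT`, and `mean_accuracy` as well as the 2,000-case threshold are hypothetical choices made for illustration.

```python
# Figures copied from the abstract; all names and the threshold are illustrative.
TOTAL_CASES = 20_620

# 80/20 split, done in integer arithmetic to avoid floating-point rounding.
train_cases = TOTAL_CASES * 80 // 100
test_cases = TOTAL_CASES - train_cases

# disease -> (training cases, reported test accuracy in %)
# Only six of the seven diseases are itemized in the abstract.
REPORTED = {
    "viral hepatitis": (2954, 99.86),
    "influenza": (3924, 98.47),
    "hand, foot, and mouth disease": (3015, 97.31),
    "syphilis": (1208, 83.03),
    "infectious diarrhea": (575, 87.30),
    "measles": (190, 42.11),
}

# Training cases the abstract does not break down by disease.
itemized_train = sum(n for n, _ in REPORTED.values())
unitemized_train = train_cases - itemized_train


def mean_accuracy(min_train, max_train):
    """Mean reported accuracy for diseases with min_train <= cases < max_train."""
    accs = [acc for n, acc in REPORTED.values() if min_train <= n < max_train]
    return sum(accs) / len(accs)


SPLIT = 2000  # hypothetical cut between "more" and "fewer" training samples
larger_group = mean_accuracy(SPLIT, float("inf"))
smaller_group = mean_accuracy(0, SPLIT)

print(f"train={train_cases}, test={test_cases}, not itemized={unitemized_train}")
print(f"mean accuracy, >= {SPLIT} cases: {larger_group:.2f}%")
print(f"mean accuracy, <  {SPLIT} cases: {smaller_group:.2f}%")
```

On these figures the trend holds between the two groups but not strictly within them: infectious diarrhea (575 training cases, 87.30%) is reported with higher accuracy than syphilis (1208 training cases, 83.03%).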
format Online
Article
Text
id pubmed-8848865
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8848865 2022-02-18 Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records Wang, Mengying Wei, Zhenhao Jia, Mo Chen, Lianzhong Ji, Hong BMC Med Inform Decis Mak Research PURPOSE: Predictively diagnosing infectious diseases helps in providing better treatment and enhances the prevention and control of such diseases. This study uses actual data from a hospital. A multiple infectious disease diagnostic model (MIDDM) is designed for conducting multi-classification of infectious diseases so as to assist in clinical infectious-disease decision-making. METHODS: Based on actual hospital medical records of infectious diseases from December 2012 to December 2020, a deep learning model for multi-classification research on infectious diseases is constructed. The data includes 20,620 cases covering seven types of infectious diseases, including outpatients and inpatients, of which training data accounted for 80%, i.e., 16,496 cases, and test data accounted for 20%, i.e., 4124 cases. Through the auto-encoder, data normalization and sparse data densification processing are carried out to improve the model training effect. A residual network and attention mechanism are introduced into the MIDDM model to improve the performance of the model. RESULT: MIDDM achieved improved prediction results in diagnosing seven kinds of infectious diseases. In the case of similar disease diagnosis characteristics and similar interference factors, the prediction accuracy of disease classification with more sample data is significantly higher than the prediction accuracy of disease classification with fewer sample data. For instance, the training data for viral hepatitis, influenza, and hand, foot, and mouth disease were 2954, 3924, and 3015 respectively and the corresponding test accuracy rates were 99.86%, 98.47%, and 97.31%. There is less training data for syphilis, infectious diarrhea, and measles, i.e., 1208, 575, and 190 respectively and the corresponding test accuracy rates were noticeably lower, i.e., 83.03%, 87.30%, and 42.11%. We also compared the MIDDM model with the models used in other studies. Using the same input data, taking viral hepatitis as an example, the accuracy of MIDDM is 99.44%, which is significantly higher than that of XGBoost (96.19%), Decision tree (90.13%), Bayesian method (85.19%), and logistic regression (91.26%). Other diseases were also significantly better predicted by MIDDM than by these three models. CONCLUSION: The application of the MIDDM model to multi-class diagnosis and prediction of infectious diseases can improve the accuracy of infectious-disease diagnosis. However, these results need to be further confirmed via clinical randomized controlled trials. BioMed Central 2022-02-16 /pmc/articles/PMC8848865/ /pubmed/35168624 http://dx.doi.org/10.1186/s12911-022-01776-y Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Wang, Mengying
Wei, Zhenhao
Jia, Mo
Chen, Lianzhong
Ji, Hong
Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records
title Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records
title_full Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records
title_fullStr Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records
title_full_unstemmed Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records
title_short Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records
title_sort deep learning model for multi-classification of infectious diseases from unstructured electronic medical records
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8848865/
https://www.ncbi.nlm.nih.gov/pubmed/35168624
http://dx.doi.org/10.1186/s12911-022-01776-y
work_keys_str_mv AT wangmengying deeplearningmodelformulticlassificationofinfectiousdiseasesfromunstructuredelectronicmedicalrecords
AT weizhenhao deeplearningmodelformulticlassificationofinfectiousdiseasesfromunstructuredelectronicmedicalrecords
AT jiamo deeplearningmodelformulticlassificationofinfectiousdiseasesfromunstructuredelectronicmedicalrecords
AT chenlianzhong deeplearningmodelformulticlassificationofinfectiousdiseasesfromunstructuredelectronicmedicalrecords
AT jihong deeplearningmodelformulticlassificationofinfectiousdiseasesfromunstructuredelectronicmedicalrecords