Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records
PURPOSE: Predictively diagnosing infectious diseases helps in providing better treatment and enhances the prevention and control of such diseases. This study uses actual data from a hospital. A multiple infectious disease diagnostic model (MIDDM) is designed for conducting multi-classification of in...
Main authors: | Wang, Mengying; Wei, Zhenhao; Jia, Mo; Chen, Lianzhong; Ji, Hong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8848865/ https://www.ncbi.nlm.nih.gov/pubmed/35168624 http://dx.doi.org/10.1186/s12911-022-01776-y |
_version_ | 1784652346797588480 |
---|---|
author | Wang, Mengying Wei, Zhenhao Jia, Mo Chen, Lianzhong Ji, Hong |
author_facet | Wang, Mengying Wei, Zhenhao Jia, Mo Chen, Lianzhong Ji, Hong |
author_sort | Wang, Mengying |
collection | PubMed |
description | PURPOSE: Predictively diagnosing infectious diseases helps in providing better treatment and enhances the prevention and control of such diseases. This study uses actual data from a hospital. A multiple infectious disease diagnostic model (MIDDM) is designed for conducting multi-classification of infectious diseases so as to assist in clinical infectious-disease decision-making. METHODS: Based on actual hospital medical records of infectious diseases from December 2012 to December 2020, a deep learning model for multi-classification research on infectious diseases is constructed. The data includes 20,620 cases covering seven types of infectious diseases, including outpatients and inpatients, of which training data accounted for 80%, i.e., 16,496 cases, and test data accounted for 20%, i.e., 4,124 cases. Through the auto-encoder, data normalization and sparse data densification processing are carried out to improve the model training effect. A residual network and attention mechanism are introduced into the MIDDM model to improve the performance of the model. RESULT: MIDDM achieved improved prediction results in diagnosing seven kinds of infectious diseases. In the case of similar disease diagnosis characteristics and similar interference factors, the prediction accuracy of disease classification with more sample data is significantly higher than the prediction accuracy of disease classification with fewer sample data. For instance, the training data for viral hepatitis, influenza, and hand, foot, and mouth disease were 2954, 3924, and 3015 cases respectively, and the corresponding test accuracy rates were 99.86%, 98.47%, and 97.31%. There is less training data for syphilis, infectious diarrhea, and measles, i.e., 1208, 575, and 190 cases respectively, and the corresponding test accuracy rates were noticeably lower, i.e., 83.03%, 87.30%, and 42.11%. We also compared the MIDDM model with the models used in other studies. 
Using the same input data, taking viral hepatitis as an example, the accuracy of MIDDM is 99.44%, which is significantly higher than that of XGBoost (96.19%), Decision tree (90.13%), Bayesian method (85.19%), and logistic regression (91.26%). Other diseases were also significantly better predicted by MIDDM than by these three models. CONCLUSION: The application of the MIDDM model to multi-class diagnosis and prediction of infectious diseases can improve the accuracy of infectious-disease diagnosis. However, these results need to be further confirmed via clinical randomized controlled trials. |
format | Online Article Text |
id | pubmed-8848865 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8848865 2022-02-18 Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records Wang, Mengying Wei, Zhenhao Jia, Mo Chen, Lianzhong Ji, Hong BMC Med Inform Decis Mak Research BioMed Central 2022-02-16 /pmc/articles/PMC8848865/ /pubmed/35168624 http://dx.doi.org/10.1186/s12911-022-01776-y Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Wang, Mengying Wei, Zhenhao Jia, Mo Chen, Lianzhong Ji, Hong Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records |
title | Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8848865/ https://www.ncbi.nlm.nih.gov/pubmed/35168624 http://dx.doi.org/10.1186/s12911-022-01776-y |
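
The case counts reported in the abstract (20,620 cases split 80%/20% into 16,496 training and 4,124 test cases) can be checked with a few lines of arithmetic. The sketch below is only an illustration of that calculation; the function name `split_counts` and the rounding choice are assumptions made here, not something taken from the article.

```python
# Figures quoted in the record's abstract.
TOTAL_CASES = 20_620
TRAIN_FRACTION = 0.80


def split_counts(total: int, train_fraction: float) -> tuple[int, int]:
    """Return (train, test) sizes for a simple fractional split.

    Hypothetical helper: the training size is rounded to the nearest
    integer and the test set receives the remainder, so the two sizes
    always sum to the total.
    """
    train = round(total * train_fraction)
    return train, total - train


train_n, test_n = split_counts(TOTAL_CASES, TRAIN_FRACTION)
print(f"train={train_n}, test={test_n}")
```

Under these assumptions the helper reproduces the training and test sizes stated in the abstract.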