Emerging infectious disease surveillance using a hierarchical diagnosis model and the Knox algorithm

Emerging infectious diseases are a critical public health challenge in the twenty-first century. The recent proliferation of such diseases has raised major social and economic concerns. Therefore, early detection of emerging infectious diseases is essential. Subjects from five medical institutions i...

Bibliographic Details
Main Authors: Wang, Mengying; Yang, Bingqing; Liu, Yunpeng; Yang, Yingyun; Ji, Hong; Yang, Cheng
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10645817/
https://www.ncbi.nlm.nih.gov/pubmed/37963966
http://dx.doi.org/10.1038/s41598-023-47010-1
author Wang, Mengying
Yang, Bingqing
Liu, Yunpeng
Yang, Yingyun
Ji, Hong
Yang, Cheng
collection PubMed
description Emerging infectious diseases are a critical public health challenge in the twenty-first century. The recent proliferation of such diseases has raised major social and economic concerns. Therefore, early detection of emerging infectious diseases is essential. Subjects from five medical institutions in Beijing, China, which met the spatial-specific requirements, were analyzed. A quality control process was used to select 37,422 medical records of infectious diseases and 56,133 cases of non-infectious diseases. An emerging infectious disease detection model (EIDDM), a two-layer model that divides the problem into two sub-problems, i.e., whether a case is an infectious disease, and if so, whether it is a known infectious disease, was proposed. The first layer model adopts the binary classification model TextCNN-Attention. The second layer is a multi-classification model of LightGBM based on the one-vs-rest strategy. Based on the experimental results, a threshold of 0.5 is selected. The model results were compared with those of other models such as XGBoost and Random Forest using the following evaluation indicators: accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. The prediction performance of the first-layer TextCNN is better than that of other comparison models. Its average specificity for non-infectious diseases is 97.57%, with an average negative predictive value of 82.63%, indicating a low risk of misdiagnosing non-infectious diseases as infectious (i.e., a low false positive rate). Its average positive predictive value for eight selected infectious diseases is 95.07%, demonstrating the model's ability to avoid misdiagnoses. The overall average accuracy of the model is 86.11%. The average prediction accuracy of the second-layer LightGBM model for emerging infectious diseases reaches 90.44%. 
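The five evaluation indicators named above can all be derived from a binary confusion matrix. The sketch below is illustrative only and is not taken from the article: the function name `binary_metrics` and the example counts are hypothetical, chosen just to show how each indicator is computed.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute five common indicators from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives for one class versus the rest.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # 1 - false positive rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts, not results from the article.
example = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Note that a low false positive rate corresponds to high specificity, whereas the negative predictive value describes how often a negative prediction is correct.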
Furthermore, the response time of a single online reasoning using the LightGBM model is approximately 27 ms, which makes it suitable for analyzing clinical records in real time. Using the Knox method, we found that all the infectious diseases were within 2000 m in our case, and a clustering feature of spatiotemporal interactions (P < 0.05) was observed as well. Performance testing and model comparison results indicated that the EIDDM is fast and accurate and can be used to monitor the onset/outbreak of emerging infectious diseases in real-world hospitals.
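The Knox method referred to above tests for space-time interaction by counting pairs of cases that are close in both distance and onset time. The following is a minimal sketch and not the authors' implementation. The 2000 m distance threshold comes from the abstract; the 7-day time window, the Monte Carlo permutation approach to the P value, the use of planar coordinates in metres, and all function names and sample data are assumptions made for illustration.

```python
import itertools
import math
import random


def knox_statistic(points, times, max_dist_m, max_days):
    """Count case pairs that are close in both space and time."""
    close_pairs = 0
    for i, j in itertools.combinations(range(len(points)), 2):
        near_in_space = math.dist(points[i], points[j]) <= max_dist_m
        near_in_time = abs(times[i] - times[j]) <= max_days
        if near_in_space and near_in_time:
            close_pairs += 1
    return close_pairs


def knox_test(points, times, max_dist_m, max_days, n_permutations=999, seed=0):
    """Monte Carlo Knox test.

    Onset times are shuffled across locations to simulate the null
    hypothesis of no space-time interaction. Returns the observed
    statistic and a one-sided permutation P value.
    """
    rng = random.Random(seed)
    observed = knox_statistic(points, times, max_dist_m, max_days)
    shuffled = list(times)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if knox_statistic(points, shuffled, max_dist_m, max_days) >= observed:
            at_least_as_extreme += 1
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical data: two tight clusters about 14 km and 50 days apart.
cluster_a = [(0, 0), (100, 0), (0, 100), (200, 200), (300, 100)]
cluster_b = [(x + 10_000, y + 10_000) for x, y in cluster_a]
points = cluster_a + cluster_b            # planar coordinates in metres
times = [0, 1, 2, 3, 4, 50, 51, 52, 53, 54]  # onset day of each case

observed, p_value = knox_test(points, times, max_dist_m=2000, max_days=7)
print(f"close pairs: {observed}, permutation P value: {p_value:.3f}")
```

For real surveillance data, geographic coordinates would first need projecting to a metric coordinate system, and the time window would be chosen to reflect the disease's incubation period.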
format Online
Article
Text
id pubmed-10645817
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10645817 2023-11-13 Sci Rep Article Nature Publishing Group UK 2023-11-13 /pmc/articles/PMC10645817/ /pubmed/37963966 http://dx.doi.org/10.1038/s41598-023-47010-1 Text en © The Author(s) 2023
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Emerging infectious disease surveillance using a hierarchical diagnosis model and the Knox algorithm
topic Article