Deep-Learning Model for Influenza Prediction From Multisource Heterogeneous Data in a Megacity: Model Development and Evaluation

BACKGROUND: In megacities, there is an urgent need to establish more sensitive forecasting and early warning methods for acute respiratory infectious diseases. Existing prediction and early warning models for influenza and other acute respiratory infectious diseases have limitations and therefore th...

Bibliographic Details
Main authors: Yang, Liuyang, Li, Gang, Yang, Jin, Zhang, Ting, Du, Jing, Liu, Tian, Zhang, Xingxing, Han, Xuan, Li, Wei, Ma, Libing, Feng, Luzhao, Yang, Weizhong
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972203/
https://www.ncbi.nlm.nih.gov/pubmed/36780207
http://dx.doi.org/10.2196/44238
author Yang, Liuyang
Li, Gang
Yang, Jin
Zhang, Ting
Du, Jing
Liu, Tian
Zhang, Xingxing
Han, Xuan
Li, Wei
Ma, Libing
Feng, Luzhao
Yang, Weizhong
collection PubMed
description BACKGROUND: In megacities, there is an urgent need to establish more sensitive forecasting and early warning methods for acute respiratory infectious diseases. Existing prediction and early warning models for influenza and other acute respiratory infectious diseases have limitations, so there is room for improvement. OBJECTIVE: The aim of this study was to explore a new and better-performing deep-learning model to predict influenza trends from multisource heterogeneous data in a megacity. METHODS: We collected multisource heterogeneous data from the 26th week of 2012 to the 25th week of 2019, including influenza-like illness (ILI) cases and virological surveillance data, climate and demographic data, and search engine data. To avoid collinearity, we selected the best predictor according to the weight and correlation of each factor. We established a new multiattention-long short-term memory (LSTM) deep-learning model (MAL model), which was used to predict the percentage of ILI (ILI%) cases and the product of ILI% and the influenza-positive rate (ILI%×positive%), respectively. We also combined the data in different forms and added several machine-learning and deep-learning models commonly used in the past to predict influenza trends for comparison. The R² value, explained variance score, mean absolute error, and mean squared error were used to evaluate the quality of the models. RESULTS: The highest correlation coefficients were found for the Baidu search data for ILI% and for air quality for ILI%×positive%. We first used the MAL model to calculate the ILI%, and then combined ILI% with climate, demographic, and Baidu data in different forms. The ILI%+climate+demography+Baidu model had the best prediction effect, with an explained variance score of 0.78, R² of 0.76, mean absolute error of 0.08, and mean squared error of 0.01. Similarly, we used the MAL model to calculate the ILI%×positive% and combined this prediction with different data forms. The ILI%×positive%+climate+demography+Baidu model had the best prediction effect, with an explained variance score of 0.74, R² of 0.70, mean absolute error of 0.02, and mean squared error of 0.02. Comparisons with random forest, extreme gradient boosting, LSTM, and gated recurrent unit models showed that the MAL model had the best prediction effect. CONCLUSIONS: The newly established MAL model outperformed existing models. Natural factors and search engine query data were more helpful in forecasting ILI patterns in megacities. With more timely and effective prediction of influenza and other respiratory infectious diseases and the epidemic intensity, early and better preparedness can be achieved to reduce the health damage to the population.
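The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and the formulas are the standard textbook definitions (population variance throughout), which may differ in detail from whatever implementation the study used.

```python
from statistics import fmean, pvariance


def regression_metrics(y_true, y_pred):
    """Return R^2, explained variance, MAE, and MSE for two equal-length series.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    var_true = pvariance(y_true)  # population variance of the observed values
    if var_true == 0:
        raise ValueError("y_true is constant, so R^2 is undefined")

    mae = fmean([abs(r) for r in residuals])   # mean absolute error
    mse = fmean([r * r for r in residuals])    # mean squared error
    r2 = 1.0 - mse / var_true                  # 1 - SS_res / SS_tot
    # Explained variance ignores a constant bias in the residuals,
    # so it is always greater than or equal to R^2.
    evs = 1.0 - pvariance(residuals) / var_true
    return {"r2": r2, "explained_variance": evs, "mae": mae, "mse": mse}


# Made-up numbers, NOT data from the study.
observed = [3.0, -0.5, 2.0, 7.0]
predicted = [2.5, 0.0, 2.0, 8.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Explained variance and R² coincide only when the residuals average to zero, which is one plausible reason the abstract reports slightly different values for the two scores.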
format Online
Article
Text
id pubmed-9972203
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-9972203 2023-03-01 J Med Internet Res Original Paper JMIR Publications 2023-02-13 /pmc/articles/PMC9972203/ /pubmed/36780207 http://dx.doi.org/10.2196/44238 Text en ©Liuyang Yang, Gang Li, Jin Yang, Ting Zhang, Jing Du, Tian Liu, Xingxing Zhang, Xuan Han, Wei Li, Libing Ma, Luzhao Feng, Weizhong Yang. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 13.02.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
title Deep-Learning Model for Influenza Prediction From Multisource Heterogeneous Data in a Megacity: Model Development and Evaluation
topic Original Paper