Early Warning and Prediction of Scarlet Fever in China Using the Baidu Search Index and Autoregressive Integrated Moving Average With Explanatory Variable (ARIMAX) Model: Time Series Analysis

Bibliographic Details
Main Authors: Luo, Tingyan, Zhou, Jie, Yang, Jing, Xie, Yulan, Wei, Yiru, Mai, Huanzhuo, Lu, Dongjia, Yang, Yuecong, Cui, Ping, Ye, Li, Liang, Hao, Huang, Jiegang
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10644180/
https://www.ncbi.nlm.nih.gov/pubmed/37902815
http://dx.doi.org/10.2196/49400
_version_ 1785134497991229440
author Luo, Tingyan
Zhou, Jie
Yang, Jing
Xie, Yulan
Wei, Yiru
Mai, Huanzhuo
Lu, Dongjia
Yang, Yuecong
Cui, Ping
Ye, Li
Liang, Hao
Huang, Jiegang
author_sort Luo, Tingyan
collection PubMed
description BACKGROUND: Internet-derived data and the autoregressive integrated moving average (ARIMA) and ARIMA with explanatory variable (ARIMAX) models are extensively used for infectious disease surveillance. However, the effectiveness of the Baidu search index (BSI) in predicting the incidence of scarlet fever remains uncertain.
OBJECTIVE: Our objective was to investigate whether a low-cost BSI monitoring system could function as a valuable complement to traditional scarlet fever surveillance in China.
METHODS: ARIMA and ARIMAX models were developed to predict the incidence of scarlet fever in China using data from the National Health Commission of the People’s Republic of China between January 2011 and August 2022. The procedures included establishing a keyword database, keyword selection and filtering through Spearman rank correlation and cross-correlation analyses, construction of the scarlet fever comprehensive search index (CSI), modeling with the training sets, predicting with the testing sets, and comparing the prediction performances.
RESULTS: The average monthly incidence of scarlet fever was 4462.17 (SD 3011.75) cases, and annual incidence exhibited an upward trend until 2019. The keyword database contained 52 keywords, but only 6 highly relevant ones were selected for modeling. A high Spearman rank correlation was observed between the scarlet fever reported cases and the scarlet fever CSI (r_s=0.881). We developed the ARIMA(4,0,0)(0,1,2)[12] model, and the ARIMA(4,0,0)(0,1,2)[12] + CSI (Lag=0) and ARIMAX(1,0,2)(2,0,0)[12] models were combined with the BSI. The 3 models had a good fit and passed the Ljung-Box test on the residuals. The ARIMA(4,0,0)(0,1,2)[12], ARIMA(4,0,0)(0,1,2)[12] + CSI (Lag=0), and ARIMAX(1,0,2)(2,0,0)[12] models demonstrated favorable predictive capabilities, with mean absolute errors of 1692.16 (95% CI 584.88-2799.44), 1067.89 (95% CI 402.02-1733.76), and 639.75 (95% CI 188.12-1091.38), respectively; root mean squared errors of 2036.92 (95% CI 929.64-3144.20), 1224.92 (95% CI 559.04-1890.79), and 830.80 (95% CI 379.17-1282.43), respectively; and mean absolute percentage errors of 4.33% (95% CI 0.54%-8.13%), 3.36% (95% CI –0.24% to 6.96%), and 2.16% (95% CI –0.69% to 5.00%), respectively. The ARIMAX models outperformed the ARIMA models, with smaller prediction errors.
CONCLUSIONS: This study demonstrated that the BSI can be used for the early warning and prediction of scarlet fever, serving as a valuable supplement to traditional surveillance systems.
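The abstract names two kinds of standard quantities: the Spearman rank correlation used to screen keywords, and the mean absolute error, root mean squared error, and mean absolute percentage error used to compare forecasts. The following is a minimal Python sketch of their textbook definitions, not the authors' code. The function names and the sample numbers are hypothetical and chosen only for illustration; the record does not include the underlying data.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def forecast_errors(actual, predicted):
    """MAE, RMSE, and MAPE (in percent); actual values must be nonzero."""
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"mae": mae, "rmse": rmse, "mape": mape}

# Hypothetical monthly case counts and search-index values (illustration only).
cases = [3200, 4100, 5600, 4800]
search_index = [150, 210, 330, 260]
print(spearman(cases, search_index))
print(forecast_errors(actual=cases, predicted=[3000, 4300, 5400, 5000]))
```

Smaller values of all three error measures indicate a closer forecast, which is the sense in which the abstract compares the models.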
format Online
Article
Text
id pubmed-10644180
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10644180 2023-10-30 J Med Internet Res Original Paper JMIR Publications 2023-10-30 /pmc/articles/PMC10644180/ /pubmed/37902815 http://dx.doi.org/10.2196/49400 Text en ©Tingyan Luo, Jie Zhou, Jing Yang, Yulan Xie, Yiru Wei, Huanzhuo Mai, Dongjia Lu, Yuecong Yang, Ping Cui, Li Ye, Hao Liang, Jiegang Huang. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 30.10.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
title Early Warning and Prediction of Scarlet Fever in China Using the Baidu Search Index and Autoregressive Integrated Moving Average With Explanatory Variable (ARIMAX) Model: Time Series Analysis
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10644180/
https://www.ncbi.nlm.nih.gov/pubmed/37902815
http://dx.doi.org/10.2196/49400