
Forecasting the COVID-19 Epidemic by Integrating Symptom Search Behavior Into Predictive Models: Infoveillance Study

BACKGROUND: Previous studies have suggested associations between trends of web searches and COVID-19 traditional metrics. It remains unclear whether models incorporating trends of digital searches lead to better predictions. OBJECTIVE: The aim of this study is to investigate the relationship between...


Bibliographic Details
Main Authors: Rabiolo, Alessandro, Alladio, Eugenio, Morales, Esteban, McNaught, Andrew Ian, Bandello, Francesco, Afifi, Abdelmonem A, Marchese, Alessandro
Format: Online Article Text
Language: English
Published: JMIR Publications 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8360333/
https://www.ncbi.nlm.nih.gov/pubmed/34156966
http://dx.doi.org/10.2196/28876
author Rabiolo, Alessandro
Alladio, Eugenio
Morales, Esteban
McNaught, Andrew Ian
Bandello, Francesco
Afifi, Abdelmonem A
Marchese, Alessandro
collection PubMed
description BACKGROUND: Previous studies have suggested associations between trends of web searches and COVID-19 traditional metrics. It remains unclear whether models incorporating trends of digital searches lead to better predictions.
OBJECTIVE: The aim of this study is to investigate the relationship between Google Trends searches of symptoms associated with COVID-19 and confirmed COVID-19 cases and deaths. We aim to develop predictive models to forecast the COVID-19 epidemic based on a combination of Google Trends searches of symptoms and conventional COVID-19 metrics.
METHODS: An open-access web application was developed to evaluate Google Trends and traditional COVID-19 metrics via an interactive framework based on principal component analysis (PCA) and time series modeling. The application facilitates the analysis of symptom search behavior associated with COVID-19 disease in 188 countries. In this study, we selected the data of nine countries as case studies to represent all continents. PCA was used to perform data dimensionality reduction, and three different time series models (error, trend, seasonality; autoregressive integrated moving average; and feed-forward neural network autoregression) were used to predict COVID-19 metrics in the upcoming 14 days. The models were compared in terms of prediction ability using the root mean square error (RMSE) of the first principal component (PC1). The predictive abilities of models generated with both Google Trends data and conventional COVID-19 metrics were compared with those fitted with conventional COVID-19 metrics only.
RESULTS: The degree of correlation and the best time lag varied as a function of the selected country and topic searched; in general, the optimal time lag was within 15 days. Overall, predictions of PC1 based on both search terms and COVID-19 traditional metrics performed better than those not including Google searches (median 1.56, IQR 0.90-2.49 versus median 1.87, IQR 1.09-2.95, respectively), but the improvement in prediction varied as a function of the selected country and time frame. The best model varied as a function of country, time range, and period of time selected. Models based on a 7-day moving average led to considerably smaller RMSE values as opposed to those calculated with raw data (median 0.90, IQR 0.50-1.53 versus median 2.27, IQR 1.62-3.74, respectively).
CONCLUSIONS: The inclusion of digital online searches in statistical models may improve the nowcasting and forecasting of the COVID-19 epidemic and could be used as one of the surveillance systems of COVID-19 disease. We provide a free web application operating with nearly real-time data that anyone can use to make predictions of outbreaks, improve estimates of the dynamics of ongoing epidemics, and predict future or rebound waves.
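The abstract names three quantities that are easy to lose track of in prose: a 7-day moving average, the time lag at which searches best correlate with cases, and the RMSE used to compare forecasts. The following is a minimal sketch of those three calculations in plain Python. It is not the authors' web application or code; the function names, the choice of Pearson correlation, the trailing (rather than centered) window, and the synthetic data are all assumptions made for illustration only.

```python
from math import sqrt


def moving_average(series, window=7):
    """Trailing moving average; the first window-1 points are dropped."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_lag(searches, cases, max_lag=15):
    """Return (lag, r) maximizing the correlation of searches[t] with cases[t + lag]."""
    results = []
    for lag in range(max_lag + 1):
        # Pair each search value with the case value `lag` days later.
        x = searches[:len(searches) - lag]
        y = cases[lag:]
        results.append((lag, pearson(x, y)))
    return max(results, key=lambda item: item[1])


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Synthetic illustration (assumed data): cases follow searches with a 5-day delay.
searches = [float((t % 20) ** 2) for t in range(60)]
cases = [0.0] * 5 + searches[:-5]
```

Calling `best_lag(searches, cases)` on the synthetic series recovers the built-in 5-day delay; with real data the series would first be smoothed with `moving_average`, and `rmse` would be applied to held-out observations against each model's 14-day forecast.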
format Online
Article
Text
id pubmed-8360333
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-8360333 2021-08-25 Forecasting the COVID-19 Epidemic by Integrating Symptom Search Behavior Into Predictive Models: Infoveillance Study Rabiolo, Alessandro Alladio, Eugenio Morales, Esteban McNaught, Andrew Ian Bandello, Francesco Afifi, Abdelmonem A Marchese, Alessandro J Med Internet Res Original Paper JMIR Publications 2021-08-11 /pmc/articles/PMC8360333/ /pubmed/34156966 http://dx.doi.org/10.2196/28876 Text en ©Alessandro Rabiolo, Eugenio Alladio, Esteban Morales, Andrew Ian McNaught, Francesco Bandello, Abdelmonem A Afifi, Alessandro Marchese. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 11.08.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
title Forecasting the COVID-19 Epidemic by Integrating Symptom Search Behavior Into Predictive Models: Infoveillance Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8360333/
https://www.ncbi.nlm.nih.gov/pubmed/34156966
http://dx.doi.org/10.2196/28876