
Forecasting SARS-CoV-2 Incidence in Ontario Municipalities with Statistical and Algorithmic Modeling and Ensembles

PURPOSE: In this study, a variety of statistical and algorithmic models were applied to forecast Covid-19 incidence in two Canadian cities, Wellington-Dufferin-Guelph (WDG) and Toronto, Ontario. The purpose of forecasting incidence in the two cities was to explore and compare the predictive capacity...


Bibliographic Details
Main Authors: Orang, A., Berke, O., Ng, V., Rees, E., Poljak, Z., Greer, A.
Format: Online Article Text
Language: English
Published: Published by Elsevier Ltd. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8884825/
http://dx.doi.org/10.1016/j.ijid.2021.12.262
_version_ 1784660252315090944
author Orang, A.
Berke, O.
Ng, V.
Rees, E.
Poljak, Z.
Greer, A.
collection PubMed
description PURPOSE: In this study, a variety of statistical and algorithmic models were applied to forecast Covid-19 incidence in two Canadian cities, Wellington-Dufferin-Guelph (WDG) and Toronto, Ontario. The purpose of forecasting incidence in the two cities was to explore and compare the predictive capacity of each approach in two regions where daily incidence differs because of population size, thus requiring different analytical approaches to inform public health. METHODS & MATERIALS: The dataset consisted of daily Covid-19 incidence within WDG and Toronto, Ontario. The data were split into training data (March 13, 2020, to June 17, 2021) and validation data (June 18, 2021, to July 8, 2021). Models fitted to the training data were assessed on the validation data. Additionally, the effective reproductive number (Re), holidays, type of variant (i.e., Alpha, Beta, Gamma, Delta), whether or not a mutation common to a variant was detected, and the cumulative number of first and second vaccination doses were included as predictors. Statistical models employed were Generalized Linear Autoregressive Moving Average (GLARMA), Seasonal Autoregressive Integrated Moving Average (SARIMA) and Regression with ARIMA errors. The two machine learning algorithms were Neural Network Autoregression (NNAR) and Random Forest (RF). A hybrid model combining the statistical and algorithmic approaches (ARIMA-Boosted) was also explored. Ensembles combining several of the models were then generated to investigate improvement in predictive performance. Performance was assessed via Root Mean Square Error (RMSE) and Mean Absolute Scaled Error (MASE). RESULTS: In WDG, regression with ARIMA errors achieved respectable forecast accuracy (RMSE = 3.50, MASE = 0.71). Ensembles provided a marginal gain in forecast accuracy (RMSE = 3.48, MASE = 0.67). In Toronto, SARIMA modeling had the superior forecasts (RMSE = 8.14, MASE = 0.52), whereas ensembles did not improve accuracy (RMSE = 8.57, MASE = 0.58). CONCLUSION: Models based on observed associations (i.e., statistical modeling) provided more accurate forecasts than data-driven algorithmic modeling (i.e., machine learning) for forecasting epidemic/pandemic trajectory. This finding was consistent in both WDG and Toronto, Ontario. While ensemble forecasts may slightly improve forecast accuracy, the computational expense did not justify their use in the current examples.
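The two accuracy measures named in the abstract, and the idea of combining forecasts, can be illustrated with a minimal Python sketch. It is not the authors' code: the record does not say how the ensembles were built, so the equal-weight average below is an assumption, as are the function names and the toy numbers (which are not study data). MASE is computed in its usual form, the forecast's mean absolute error divided by the in-sample mean absolute error of a naive forecast; whether the study used a one-step or seasonal naive baseline is not stated, so the `season` parameter is also an assumption.

```python
import math

def rmse(actual, predicted):
    """Root mean square error over the validation period."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mase(actual, predicted, training, season=1):
    """Mean absolute scaled error.

    The scale is the in-sample mean absolute error of the naive forecast
    that repeats the value observed `season` steps earlier.
    """
    naive_errors = [abs(training[t] - training[t - season])
                    for t in range(season, len(training))]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / scale

def mean_ensemble(forecasts):
    """Equal-weight average of several forecasts (assumed combination rule)."""
    return [sum(values) / len(values) for values in zip(*forecasts)]

# Toy daily counts, invented for illustration only.
training = [10, 12, 11, 13, 12]
actual = [14, 13, 15]
model_a = [13, 13, 14]   # hypothetical forecast from one model
model_b = [15, 12, 16]   # hypothetical forecast from another model

ensemble = mean_ensemble([model_a, model_b])
for name, forecast in [("A", model_a), ("B", model_b), ("ensemble", ensemble)]:
    print(name, round(rmse(actual, forecast), 3),
          round(mase(actual, forecast, training), 3))
```

A MASE below 1 means the forecast's average absolute error is smaller than that of the naive baseline on the training data, which is how values such as 0.71 or 0.52 in the abstract can be read.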
format Online
Article
Text
id pubmed-8884825
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Published by Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-8884825 2022-03-01 Forecasting SARS-CoV-2 Incidence in Ontario Municipalities with Statistical and Algorithmic Modeling and Ensembles Orang, A. Berke, O. Ng, V. Rees, E. Poljak, Z. Greer, A. Int J Infect Dis Ps25.04 (1189) Published by Elsevier Ltd. 2022-03 2022-02-28 /pmc/articles/PMC8884825/ http://dx.doi.org/10.1016/j.ijid.2021.12.262 Text en Copyright © 2021 Published by Elsevier Ltd. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Forecasting SARS-CoV-2 Incidence in Ontario Municipalities with Statistical and Algorithmic Modeling and Ensembles
topic Ps25.04 (1189)
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8884825/
http://dx.doi.org/10.1016/j.ijid.2021.12.262