Forecasting SARS-CoV-2 Incidence in Ontario Municipalities with Statistical and Algorithmic Modeling and Ensembles
PURPOSE: In this study, a variety of statistical and algorithmic models were applied to forecast Covid-19 incidence in two Canadian cities, Wellington-Dufferin-Guelph (WDG) and Toronto, Ontario. The purpose of forecasting incidence in the two cities was to explore and compare the predictive capacity...
Main Authors: | Orang, A., Berke, O., Ng, V., Rees, E., Poljak, Z., Greer, A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Published by Elsevier Ltd. 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8884825/ http://dx.doi.org/10.1016/j.ijid.2021.12.262 |
_version_ | 1784660252315090944 |
---|---|
author | Orang, A. Berke, O. Ng, V. Rees, E. Poljak, Z. Greer, A. |
author_facet | Orang, A. Berke, O. Ng, V. Rees, E. Poljak, Z. Greer, A. |
author_sort | Orang, A. |
collection | PubMed |
description | PURPOSE: In this study, a variety of statistical and algorithmic models were applied to forecast Covid-19 incidence in two Canadian cities, Wellington-Dufferin-Guelph (WDG) and Toronto, Ontario. The purpose of forecasting incidence in the two cities was to explore and compare the predictive capacity of each approach in two regions where daily incidences differ due to population sizes, thus requiring different analytical approaches to inform public health. METHODS & MATERIALS: The dataset consisted of daily Covid-19 incidence within WDG and Toronto, Ontario. The data were split into training data (March 13, 2020, to June 17, 2021) and validation data (June 18, 2021, to July 8, 2021). Models fitted to the training data were assessed on the validation data. Additionally, the effective reproductive number (Re), holidays, type of variant (i.e., Alpha, Beta, Gamma, Delta), whether a mutation common to a variant was detected, and the cumulative numbers of first and second vaccination doses were included as predictors. Statistical models employed were Generalized Linear Autoregressive Moving Average (GLARMA), Seasonal Autoregressive Integrated Moving Average (SARIMA) and Regression with ARIMA errors. The two machine learning algorithms were Neural Network Autoregression (NNAR) and Random Forest (RF). A hybrid model combining the statistical and algorithmic approaches (ARIMA-Boosted) was also explored. Ensembles combining several of the models were then generated to investigate improvement in predictive performance. Performance was assessed via Root Mean Square Prediction Error (RMSE) and Mean Absolute Scaled Error (MASE). RESULTS: In WDG, regression with ARIMA errors achieved respectable forecast accuracy (RMSE = 3.50, MASE = 0.71). Ensembles provided a marginal gain in forecast accuracy (RMSE = 3.48, MASE = 0.67). In Toronto, SARIMA modeling produced the most accurate forecasts (RMSE = 8.14, MASE = 0.52), whereas ensembles did not improve accuracy (RMSE = 8.57, MASE = 0.58). CONCLUSION: Models based on observed associations (i.e., statistical modeling) provided more accurate forecasts than data-driven algorithmic modeling (i.e., machine learning) for forecasting epidemic/pandemic trajectory. This finding was consistent in both WDG and Toronto, Ontario. While ensemble forecasts may slightly improve forecast accuracy, the computational expense did not justify their application in the current examples. |
format | Online Article Text |
id | pubmed-8884825 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Published by Elsevier Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8884825 2022-03-01 Forecasting SARS-CoV-2 Incidence in Ontario Municipalities with Statistical and Algorithmic Modeling and Ensembles Orang, A. Berke, O. Ng, V. Rees, E. Poljak, Z. Greer, A. Int J Infect Dis Ps25.04 (1189) Published by Elsevier Ltd. 2022-03 2022-02-28 /pmc/articles/PMC8884825/ http://dx.doi.org/10.1016/j.ijid.2021.12.262 Text en Copyright © 2021 Published by Elsevier Ltd. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
title | Forecasting SARS-CoV-2 Incidence in Ontario Municipalities with Statistical and Algorithmic Modeling and Ensembles |
topic | Ps25.04 (1189) |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8884825/ http://dx.doi.org/10.1016/j.ijid.2021.12.262 |
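
The abstract scores each forecast on a held-out validation window using RMSE and MASE, and also evaluates ensembles of several models. The record does not say how the ensembles were weighted or which scaling MASE used, so the following is only a minimal sketch under stated assumptions: an equal-weight average as the ensemble, and MASE scaled by the in-sample error of a one-step naive forecast (the common definition). All function names and the toy numbers are hypothetical; none of the values come from the study.

```python
import math
from datetime import date


def rmse(actual, forecast):
    """Root mean square error of a forecast over the validation window."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mase(actual, forecast, training, season=1):
    """Mean absolute scaled error.

    Validation MAE divided by the in-sample MAE of a naive forecast that
    repeats the value observed `season` steps earlier (assumed scaling).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    naive_errors = [
        abs(training[i] - training[i - season])
        for i in range(season, len(training))
    ]
    naive_mae = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / naive_mae


def mean_ensemble(forecasts):
    """Equal-weight ensemble: average the models' forecasts day by day."""
    return [sum(day) / len(day) for day in zip(*forecasts)]


# Length of the validation window stated in the abstract (inclusive dates).
validation_days = (date(2021, 7, 8) - date(2021, 6, 18)).days + 1

# Toy daily counts, invented purely for illustration.
training = [2, 4, 3, 5, 4, 6]
actual = [5, 7, 6]
model_a = [4, 6, 8]  # hypothetical forecast from one model
model_b = [6, 6, 6]  # hypothetical forecast from another model

ensemble = mean_ensemble([model_a, model_b])
for name, fc in [("model_a", model_a), ("model_b", model_b), ("ensemble", ensemble)]:
    print(name, round(rmse(actual, fc), 3), round(mase(actual, fc, training), 3))
```

A MASE below 1 means the forecast beat the naive benchmark on average, which is how values such as 0.52 or 0.71 in the abstract can be read. For daily data with a weekly pattern, passing `season=7` would scale by a seasonal naive forecast instead; which variant the authors used is not stated in this record.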