Forecasting the dynamics of cumulative COVID-19 cases (confirmed, recovered and deaths) for top-16 countries using statistical machine learning models: Auto-Regressive Integrated Moving Average (ARIMA) and Seasonal Auto-Regressive Integrated Moving Average (SARIMA)

Most countries are reopening or considering lifting stringent prevention policies such as lockdowns; consequently, daily coronavirus disease (COVID-19) cases (confirmed, recovered and deaths) are increasing significantly. As of July 25th, there are 16.5 million global cumulative confirmed cases,...

Bibliographic Details
Main authors: ArunKumar, K.E., Kalaga, Dinesh V., Sai Kumar, Ch. Mohan, Chilkoor, Govinda, Kawaji, Masahiro, Brenza, Timothy M.
Format: Online Article Text
Language: English
Published: Elsevier B.V. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7869631/
https://www.ncbi.nlm.nih.gov/pubmed/33584158
http://dx.doi.org/10.1016/j.asoc.2021.107161
author ArunKumar, K.E.
Kalaga, Dinesh V.
Sai Kumar, Ch. Mohan
Chilkoor, Govinda
Kawaji, Masahiro
Brenza, Timothy M.
author_sort ArunKumar, K.E.
collection PubMed
description Most countries are reopening or considering lifting stringent prevention policies such as lockdowns; consequently, daily coronavirus disease (COVID-19) cases (confirmed, recovered and deaths) are increasing significantly. As of July 25th, there are 16.5 million global cumulative confirmed cases, 9.4 million cumulative recovered cases and 0.65 million deaths. There is a pressing need to monitor and estimate future COVID-19 cases in order to control the spread and help countries prepare their healthcare systems. In this study, two time-series models, Auto-Regressive Integrated Moving Average (ARIMA) and Seasonal Auto-Regressive Integrated Moving Average (SARIMA), are used to forecast the epidemiological trends of the COVID-19 pandemic for the top-16 countries, where 70%–80% of global cumulative cases are located. Initial combinations of the model parameters were selected using the auto-ARIMA model, and the optimized model parameters were then found based on the best fit between the predictions and the test data. The analytical tools Auto-Correlation Function (ACF), Partial Auto-Correlation Function (PACF), Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were used to assess the reliability of the models. The evaluation metrics Mean Absolute Error (MAE), Mean Square Error (MSE), Root Mean Square Error (RMSE) and Mean Absolute Percent Error (MAPE) were used as criteria for selecting the best model. A case study is presented in which the statistical methodology for model selection and the procedure for forecasting the COVID-19 cases of the USA are discussed in detail. The best ARIMA and SARIMA model parameters for each country are selected manually, and the optimized parameters are then used to forecast the COVID-19 cases. Forecasted trends for confirmed and recovered cases showed an exponential rise for countries such as the United States, Brazil, South Africa, Colombia, Bangladesh, India, Mexico and Pakistan.
Similarly, trends for cumulative deaths showed an exponential rise for Brazil, South Africa, Chile, Colombia, Bangladesh, India, Mexico, Iran, Peru, and Russia. SARIMA model predictions are more realistic than those of the ARIMA model, confirming the existence of seasonality in COVID-19 data. The results of this study not only shed light on the future trends of the COVID-19 outbreak in the top-16 countries but also guide these countries in preparing their health care policies for the ongoing pandemic. The data used in this work were obtained from the publicly available Johns Hopkins University COVID-19 database.
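The description names four evaluation metrics (MAE, MSE, RMSE, MAPE) as the criteria for picking the best model but does not define them. The following is a minimal sketch of their standard textbook definitions, plus a helper that picks the candidate forecast with the lowest error on held-out test data. It is not the authors' code: the function names (`mae`, `mse`, `rmse`, `mape`, `best_model`), the choice of RMSE as the default selection metric, and the toy numbers are all assumptions made for illustration, and MAPE is assumed to be expressed in percent.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Square Error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean Absolute Percent Error, in percent.

    Assumes no value in `actual` is zero (true for cumulative counts
    once the first case has been recorded).
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def best_model(actual, forecasts, metric=rmse):
    """Return the name of the candidate whose forecast minimizes `metric`.

    `forecasts` maps a hypothetical model label to its predictions over
    the same test window as `actual`.
    """
    return min(forecasts, key=lambda name: metric(actual, forecasts[name]))


# Made-up toy numbers, NOT data from the study.
test_data = [100.0, 110.0, 120.0, 130.0]
candidates = {
    "model_a": [98.0, 112.0, 119.0, 133.0],
    "model_b": [90.0, 100.0, 115.0, 140.0],
}

if __name__ == "__main__":
    for name, forecast in candidates.items():
        print(name, mae(test_data, forecast), rmse(test_data, forecast))
    print("selected:", best_model(test_data, candidates))
```

In practice the candidate forecasts would come from fitted ARIMA or SARIMA models with different parameter combinations; the comparison step itself stays the same regardless of which library produces the forecasts.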
format Online
Article
Text
id pubmed-7869631
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier B.V.
record_format MEDLINE/PubMed
spelling pubmed-7869631 2021-02-09 Appl Soft Comput Article Elsevier B.V. 2021-05 2021-02-08 /pmc/articles/PMC7869631/ /pubmed/33584158 http://dx.doi.org/10.1016/j.asoc.2021.107161 Text en © 2021 Elsevier B.V. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source.
These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Forecasting the dynamics of cumulative COVID-19 cases (confirmed, recovered and deaths) for top-16 countries using statistical machine learning models: Auto-Regressive Integrated Moving Average (ARIMA) and Seasonal Auto-Regressive Integrated Moving Average (SARIMA)
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7869631/
https://www.ncbi.nlm.nih.gov/pubmed/33584158
http://dx.doi.org/10.1016/j.asoc.2021.107161