Forecasting the dynamics of cumulative COVID-19 cases (confirmed, recovered and deaths) for top-16 countries using statistical machine learning models: Auto-Regressive Integrated Moving Average (ARIMA) and Seasonal Auto-Regressive Integrated Moving Average (SARIMA)
Most countries are reopening or considering lifting stringent prevention policies such as lockdowns; consequently, daily coronavirus disease (COVID-19) cases (confirmed, recovered and deaths) are increasing significantly. As of July 25th, there are 16.5 million global cumulative confirmed cases,...
Main authors: | ArunKumar, K.E.; Kalaga, Dinesh V.; Sai Kumar, Ch. Mohan; Chilkoor, Govinda; Kawaji, Masahiro; Brenza, Timothy M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier B.V. 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7869631/ https://www.ncbi.nlm.nih.gov/pubmed/33584158 http://dx.doi.org/10.1016/j.asoc.2021.107161 |
_version_ | 1783648667010334720 |
---|---|
author | ArunKumar, K.E. Kalaga, Dinesh V. Sai Kumar, Ch. Mohan Chilkoor, Govinda Kawaji, Masahiro Brenza, Timothy M. |
author_sort | ArunKumar, K.E. |
collection | PubMed |
description | Most countries are reopening or considering lifting stringent prevention policies such as lockdowns; consequently, daily coronavirus disease (COVID-19) cases (confirmed, recovered and deaths) are increasing significantly. As of July 25th, there are 16.5 million global cumulative confirmed cases, 9.4 million cumulative recovered cases and 0.65 million deaths. There is a pressing need to monitor and estimate future COVID-19 cases in order to control the spread and help countries prepare their healthcare systems. In this study, two time-series models, Auto-Regressive Integrated Moving Average (ARIMA) and Seasonal Auto-Regressive Integrated Moving Average (SARIMA), are used to forecast the epidemiological trends of the COVID-19 pandemic for the top-16 countries, where 70%–80% of global cumulative cases are located. Initial combinations of the model parameters were selected using the auto-ARIMA model, and the optimized model parameters were then found based on the best fit between the predictions and the test data. The analytical tools Auto-Correlation Function (ACF), Partial Auto-Correlation Function (PACF), Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were used to assess the reliability of the models. The evaluation metrics Mean Absolute Error (MAE), Mean Square Error (MSE), Root Mean Square Error (RMSE) and Mean Absolute Percent Error (MAPE) were used as criteria for selecting the best model. A case study is presented in which the statistical methodology for model selection and the procedure for forecasting the COVID-19 cases of the USA are discussed in detail. The best ARIMA and SARIMA model parameters for each country were selected manually, and the optimized parameters were then used to forecast the COVID-19 cases. Forecasted trends for confirmed and recovered cases showed an exponential rise for countries such as the United States, Brazil, South Africa, Colombia, Bangladesh, India, Mexico and Pakistan. Similarly, trends for cumulative deaths showed an exponential rise for Brazil, South Africa, Chile, Colombia, Bangladesh, India, Mexico, Iran, Peru, and Russia. SARIMA model predictions are more realistic than those of the ARIMA model, confirming the existence of seasonality in COVID-19 data. The results of this study not only shed light on the future trends of the COVID-19 outbreak in the top-16 countries but also guide these countries in preparing their health care policies for the ongoing pandemic. The data used in this work were obtained from the publicly available Johns Hopkins University COVID-19 database. |
format | Online Article Text |
id | pubmed-7869631 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier B.V. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7869631 2021-02-09 Appl Soft Comput Article Elsevier B.V. 2021-05 2021-02-08 /pmc/articles/PMC7869631/ /pubmed/33584158 http://dx.doi.org/10.1016/j.asoc.2021.107161 Text en © 2021 Elsevier B.V. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
title | Forecasting the dynamics of cumulative COVID-19 cases (confirmed, recovered and deaths) for top-16 countries using statistical machine learning models: Auto-Regressive Integrated Moving Average (ARIMA) and Seasonal Auto-Regressive Integrated Moving Average (SARIMA) |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7869631/ https://www.ncbi.nlm.nih.gov/pubmed/33584158 http://dx.doi.org/10.1016/j.asoc.2021.107161 |
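The description in this record names four error metrics (MAE, MSE, RMSE, MAPE) and two information criteria (AIC, BIC) used to compare candidate models. The sketch below shows the standard textbook definitions of these quantities in plain Python. It is not the authors' code: the function names, the two candidate forecasts and all numbers are hypothetical and chosen only for illustration, and the choice of RMSE as the tie-breaking metric is likewise an assumption.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Square Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean Absolute Percent Error, in percent (undefined if an actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical held-out test data and two hypothetical candidate forecasts.
test_cases = [100.0, 200.0, 400.0]
candidates = {
    "model_a": [110.0, 190.0, 380.0],
    "model_b": [100.0, 150.0, 400.0],
}

# Pick the candidate with the lowest RMSE on the test data (an assumed criterion).
best = min(candidates, key=lambda name: rmse(test_cases, candidates[name]))

for name, forecast in candidates.items():
    print(name,
          round(mae(test_cases, forecast), 2),
          round(rmse(test_cases, forecast), 2),
          round(mape(test_cases, forecast), 2))
print("lowest RMSE:", best)
```

Lower values are better for all six quantities. The error metrics compare forecasts against held-out data, while AIC and BIC are computed from the fitted model's log-likelihood and penalize the number of parameters.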