
Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh

Accurate predictive time series modelling is important in public health planning and response during the emergence of a novel pandemic. Therefore, the aims of the study are three-fold: (a) to model the overall trend of COVID-19 confirmed cases and deaths in Bangladesh; (b) to generate a short-term f...


Bibliographic Details
Main Authors: Rahman, Md. Siddikur, Chowdhury, Arman Hossain, Amrin, Miftahuzzannat
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10021465/
https://www.ncbi.nlm.nih.gov/pubmed/36962227
http://dx.doi.org/10.1371/journal.pgph.0000495
_version_ 1784908491854446592
author Rahman, Md. Siddikur
Chowdhury, Arman Hossain
Amrin, Miftahuzzannat
author_facet Rahman, Md. Siddikur
Chowdhury, Arman Hossain
Amrin, Miftahuzzannat
author_sort Rahman, Md. Siddikur
collection PubMed
description Accurate predictive time series modelling is important in public health planning and response during the emergence of a novel pandemic. Therefore, the aims of the study are three-fold: (a) to model the overall trend of COVID-19 confirmed cases and deaths in Bangladesh; (b) to generate a short-term forecast of 8 weeks of COVID-19 cases and deaths; (c) to compare the predictive accuracy of the Autoregressive Integrated Moving Average (ARIMA) and eXtreme Gradient Boosting (XGBoost) for precise modelling of non-linear features and seasonal trends of the time series. The data, covering the period from the onset of the epidemic in Bangladesh, were collected from the Directorate General of Health Service (DGHS) and the Institute of Epidemiology, Disease Control and Research (IEDCR). Daily confirmed COVID-19 cases and deaths over 633 days in Bangladesh were divided into several training and test sets. The ARIMA and XGBoost models were established using the training data, and the test sets were used to evaluate each model’s ability to forecast; the predictive performances were then averaged across test sets to choose the best model. The predictive accuracy of the models was assessed using the mean absolute error (MAE), mean percentage error (MPE), root mean square error (RMSE) and mean absolute percentage error (MAPE). The findings reveal the existence of a nonlinear trend and weekly seasonality in the dataset. The average error measures of the ARIMA model for both COVID-19 confirmed cases and deaths were lower than those of the XGBoost model. Hence, in our study, the ARIMA model performed better than the XGBoost model in predicting COVID-19 confirmed cases and deaths in Bangladesh. The suggested prediction model might play a critical role in estimating the spread of a novel pandemic in Bangladesh and similar countries.
format Online
Article
Text
id pubmed-10021465
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-100214652023-03-17 Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh Rahman, Md. Siddikur Chowdhury, Arman Hossain Amrin, Miftahuzzannat PLOS Glob Public Health Research Article Accurate predictive time series modelling is important in public health planning and response during the emergence of a novel pandemic. Therefore, the aims of the study are three-fold: (a) to model the overall trend of COVID-19 confirmed cases and deaths in Bangladesh; (b) to generate a short-term forecast of 8 weeks of COVID-19 cases and deaths; (c) to compare the predictive accuracy of the Autoregressive Integrated Moving Average (ARIMA) and eXtreme Gradient Boosting (XGBoost) for precise modelling of non-linear features and seasonal trends of the time series. The data, covering the period from the onset of the epidemic in Bangladesh, were collected from the Directorate General of Health Service (DGHS) and the Institute of Epidemiology, Disease Control and Research (IEDCR). Daily confirmed COVID-19 cases and deaths over 633 days in Bangladesh were divided into several training and test sets. The ARIMA and XGBoost models were established using the training data, and the test sets were used to evaluate each model’s ability to forecast; the predictive performances were then averaged across test sets to choose the best model. The predictive accuracy of the models was assessed using the mean absolute error (MAE), mean percentage error (MPE), root mean square error (RMSE) and mean absolute percentage error (MAPE). The findings reveal the existence of a nonlinear trend and weekly seasonality in the dataset. The average error measures of the ARIMA model for both COVID-19 confirmed cases and deaths were lower than those of the XGBoost model. Hence, in our study, the ARIMA model performed better than the XGBoost model in predicting COVID-19 confirmed cases and deaths in Bangladesh.
The suggested prediction model might play a critical role in estimating the spread of a novel pandemic in Bangladesh and similar countries. Public Library of Science 2022-05-18 /pmc/articles/PMC10021465/ /pubmed/36962227 http://dx.doi.org/10.1371/journal.pgph.0000495 Text en © 2022 Rahman et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Rahman, Md. Siddikur
Chowdhury, Arman Hossain
Amrin, Miftahuzzannat
Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh
title Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh
title_full Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh
title_fullStr Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh
title_full_unstemmed Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh
title_short Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh
title_sort accuracy comparison of arima and xgboost forecasting models in predicting the incidence of covid-19 in bangladesh
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10021465/
https://www.ncbi.nlm.nih.gov/pubmed/36962227
http://dx.doi.org/10.1371/journal.pgph.0000495
work_keys_str_mv AT rahmanmdsiddikur accuracycomparisonofarimaandxgboostforecastingmodelsinpredictingtheincidenceofcovid19inbangladesh
AT chowdhuryarmanhossain accuracycomparisonofarimaandxgboostforecastingmodelsinpredictingtheincidenceofcovid19inbangladesh
AT amrinmiftahuzzannat accuracycomparisonofarimaandxgboostforecastingmodelsinpredictingtheincidenceofcovid19inbangladesh
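
The description above names four error measures (MAE, MPE, RMSE, MAPE) and says they were averaged over several test sets. The sketch below shows one common way these could be computed. It is not the authors' code: the function names `forecast_errors` and `average_metrics`, the sign convention for MPE (actual minus forecast), and the choice to skip days with zero actual counts in the percentage measures are all assumptions made for illustration.

```python
import math


def forecast_errors(actual, forecast):
    """Return MAE, MPE, RMSE and MAPE for two equal-length sequences.

    Assumption: errors are defined as actual - forecast, and percentage
    errors are computed only where the actual value is non-zero.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal in length")

    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)

    # Percentage errors are undefined when the actual value is zero,
    # so those points are skipped here (an illustrative choice).
    pct = [100.0 * e / a for e, a in zip(errors, actual) if a != 0]

    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MPE": sum(pct) / len(pct) if pct else float("nan"),
        "MAPE": sum(abs(p) for p in pct) / len(pct) if pct else float("nan"),
    }


def average_metrics(per_split):
    """Average each error measure over a list of per-test-set results."""
    keys = per_split[0].keys()
    return {k: sum(m[k] for m in per_split) / len(per_split) for k in keys}


if __name__ == "__main__":
    # Made-up numbers, not data from the study.
    split_a = forecast_errors([100, 200, 400], [110, 190, 400])
    split_b = forecast_errors([50, 80], [40, 100])
    print(average_metrics([split_a, split_b]))
```

Under this scheme, each candidate model would be scored on every test set with `forecast_errors`, and the model with the lower averaged measures would be preferred.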