Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study

Bibliographic Details
Main Authors: Fang, Zheng-gang, Yang, Shu-qin, Lv, Cai-xia, An, Shu-yi, Wu, Wei
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9251895/
https://www.ncbi.nlm.nih.gov/pubmed/35777884
http://dx.doi.org/10.1136/bmjopen-2021-056685
Description
Summary: OBJECTIVE: The COVID-19 outbreak was first reported in Wuhan, China, and has been acknowledged as a pandemic due to its rapid spread worldwide. Predicting the trend of COVID-19 is of great significance for its prevention. The autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model were compared to determine which more accurately predicts the occurrence of COVID-19 in the USA. DESIGN: Time-series study. SETTING: The USA. MAIN OUTCOME MEASURES: Three accuracy metrics, mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE), were used to evaluate the performance of the two models. RESULTS: On both the training set and the validation set, the MAE, RMSE and MAPE of the XGBoost model were lower than those of the ARIMA model. CONCLUSIONS: Compared with the ARIMA model, the XGBoost model can help improve the prediction of COVID-19 cases in the USA.
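
The three accuracy metrics named in the summary can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of MAE, RMSE and MAPE; the function names and the sample values are hypothetical and are not taken from the study, whose own code and data are not shown in this record.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero; otherwise the ratio is undefined.
    """
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Made-up case counts for illustration only (not data from the study).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

print(f"MAE  = {mae(actual, predicted):.4f}")
print(f"RMSE = {rmse(actual, predicted):.4f}")
print(f"MAPE = {mape(actual, predicted):.4f}%")
```

For all three metrics a lower value means the predictions are closer to the observed series, which is the sense in which the summary reports XGBoost outperforming ARIMA.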