Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study
OBJECTIVE: The COVID-19 outbreak was first reported in Wuhan, China, and has been acknowledged as a pandemic due to its rapid spread worldwide. Predicting the trend of COVID-19 is of great significance for its prevention. A comparison between the autoregressive integrated moving average (ARIMA) mode...
Main authors: | Fang, Zheng-gang; Yang, Shu-qin; Lv, Cai-xia; An, Shu-yi; Wu, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group 2022 |
Subjects: | Epidemiology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9251895/ https://www.ncbi.nlm.nih.gov/pubmed/35777884 http://dx.doi.org/10.1136/bmjopen-2021-056685 |
_version_ | 1784740133944164352 |
---|---|
author | Fang, Zheng-gang Yang, Shu-qin Lv, Cai-xia An, Shu-yi Wu, Wei |
author_facet | Fang, Zheng-gang Yang, Shu-qin Lv, Cai-xia An, Shu-yi Wu, Wei |
author_sort | Fang, Zheng-gang |
collection | PubMed |
description | OBJECTIVE: The COVID-19 outbreak was first reported in Wuhan, China, and has been acknowledged as a pandemic due to its rapid spread worldwide. Predicting the trend of COVID-19 is of great significance for its prevention. A comparison between the autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model was conducted to determine which was more accurate for anticipating the occurrence of COVID-19 in the USA. DESIGN: Time-series study. SETTING: The USA was the setting for this study. MAIN OUTCOME MEASURES: Three accuracy metrics, mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE), were applied to evaluate the performance of the two models. RESULTS: In our study, for the training set and the validation set, the MAE, RMSE and MAPE of the XGBoost model were less than those of the ARIMA model. CONCLUSIONS: The XGBoost model can help improve prediction of COVID-19 cases in the USA over the ARIMA model. |
format | Online Article Text |
id | pubmed-9251895 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-92518952022-07-05 Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study Fang, Zheng-gang Yang, Shu-qin Lv, Cai-xia An, Shu-yi Wu, Wei BMJ Open Epidemiology OBJECTIVE: The COVID-19 outbreak was first reported in Wuhan, China, and has been acknowledged as a pandemic due to its rapid spread worldwide. Predicting the trend of COVID-19 is of great significance for its prevention. A comparison between the autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model was conducted to determine which was more accurate for anticipating the occurrence of COVID-19 in the USA. DESIGN: Time-series study. SETTING: The USA was the setting for this study. MAIN OUTCOME MEASURES: Three accuracy metrics, mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE), were applied to evaluate the performance of the two models. RESULTS: In our study, for the training set and the validation set, the MAE, RMSE and MAPE of the XGBoost model were less than those of the ARIMA model. CONCLUSIONS: The XGBoost model can help improve prediction of COVID-19 cases in the USA over the ARIMA model. BMJ Publishing Group 2022-07-01 /pmc/articles/PMC9251895/ /pubmed/35777884 http://dx.doi.org/10.1136/bmjopen-2021-056685 Text en © Author(s) (or their employer(s)) 2022. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/. |
spellingShingle | Epidemiology Fang, Zheng-gang Yang, Shu-qin Lv, Cai-xia An, Shu-yi Wu, Wei Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study |
title | Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study |
title_full | Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study |
title_fullStr | Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study |
title_full_unstemmed | Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study |
title_short | Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study |
title_sort | application of a data-driven xgboost model for the prediction of covid-19 in the usa: a time-series study |
topic | Epidemiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9251895/ https://www.ncbi.nlm.nih.gov/pubmed/35777884 http://dx.doi.org/10.1136/bmjopen-2021-056685 |
work_keys_str_mv | AT fangzhenggang applicationofadatadrivenxgboostmodelforthepredictionofcovid19intheusaatimeseriesstudy AT yangshuqin applicationofadatadrivenxgboostmodelforthepredictionofcovid19intheusaatimeseriesstudy AT lvcaixia applicationofadatadrivenxgboostmodelforthepredictionofcovid19intheusaatimeseriesstudy AT anshuyi applicationofadatadrivenxgboostmodelforthepredictionofcovid19intheusaatimeseriesstudy AT wuwei applicationofadatadrivenxgboostmodelforthepredictionofcovid19intheusaatimeseriesstudy |
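
The description field in this record names three accuracy metrics (MAE, RMSE and MAPE) used to compare the ARIMA and XGBoost forecasts. As a minimal sketch of how these metrics are conventionally defined, the Python below computes them with the standard library only. The function names and the sample numbers are illustrative assumptions; they are not taken from the article or its data, and this is not the authors' code.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero; this sketch does not handle
    that case.
    """
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Made-up values for illustration only (not case counts from the study).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(f"MAE:  {mae(actual, predicted):.2f}")
print(f"RMSE: {rmse(actual, predicted):.2f}")
print(f"MAPE: {mape(actual, predicted):.2f}%")
```

Lower values indicate a closer fit on all three metrics, which is the sense in which the record reports the XGBoost values as "less than those of the ARIMA model".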