Comparison of ARIMA model and XGBoost model for prediction of human brucellosis in mainland China: a time-series study
OBJECTIVES: Human brucellosis is a public health problem endangering health and property in China. Predicting the trend and the seasonality of human brucellosis is of great significance for its prevention. In this study, a comparison between the autoregressive integrated moving average (ARIMA) model...
Main authors: | Alim, Mirxat; Ye, Guo-Hua; Guan, Peng; Huang, De-Sheng; Zhou, Bao-Sen; Wu, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group 2020 |
Subjects: | Epidemiology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7722837/ https://www.ncbi.nlm.nih.gov/pubmed/33293308 http://dx.doi.org/10.1136/bmjopen-2020-039676 |
_version_ | 1783620231841710080 |
---|---|
author | Alim, Mirxat Ye, Guo-Hua Guan, Peng Huang, De-Sheng Zhou, Bao-Sen Wu, Wei |
author_sort | Alim, Mirxat |
collection | PubMed |
description | OBJECTIVES: Human brucellosis is a public health problem endangering health and property in China. Predicting the trend and the seasonality of human brucellosis is of great significance for its prevention. In this study, a comparison between the autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model was conducted to determine which was more suitable for predicting the occurrence of brucellosis in mainland China. DESIGN: Time-series study. SETTING: Mainland China. METHODS: Data on human brucellosis in mainland China were provided by the National Health and Family Planning Commission of China. The data were divided into a training set and a test set. The training set was composed of the monthly incidence of human brucellosis in mainland China from January 2008 to June 2018, and the test set was composed of the monthly incidence from July 2018 to June 2019. The mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE) were used to evaluate the effects of model fitting and prediction. RESULTS: The number of human brucellosis patients in mainland China increased from 30 002 in 2008 to 40 328 in 2018. There was an increasing trend and obvious seasonal distribution in the original time series. For the training set, the MAE, RMSE and MAPE of the ARIMA(0,1,1)×(0,1,1)₁₂ model were 338.867, 450.223 and 10.323, respectively, and the MAE, RMSE and MAPE of the XGBoost model were 189.332, 262.458 and 4.475, respectively. For the test set, the MAE, RMSE and MAPE of the ARIMA(0,1,1)×(0,1,1)₁₂ model were 529.406, 586.059 and 17.676, respectively, and the MAE, RMSE and MAPE of the XGBoost model were 249.307, 280.645 and 7.643, respectively. CONCLUSIONS: The performance of the XGBoost model was better than that of the ARIMA model. The XGBoost model is more suitable for predicting cases of human brucellosis in mainland China. |
format | Online Article Text |
id | pubmed-7722837 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-7722837 2020-12-14 Comparison of ARIMA model and XGBoost model for prediction of human brucellosis in mainland China: a time-series study Alim, Mirxat Ye, Guo-Hua Guan, Peng Huang, De-Sheng Zhou, Bao-Sen Wu, Wei BMJ Open Epidemiology BMJ Publishing Group 2020-12-07 /pmc/articles/PMC7722837/ /pubmed/33293308 http://dx.doi.org/10.1136/bmjopen-2020-039676 Text en © Author(s) (or their employer(s)) 2020. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. http://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/. |
title | Comparison of ARIMA model and XGBoost model for prediction of human brucellosis in mainland China: a time-series study |
topic | Epidemiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7722837/ https://www.ncbi.nlm.nih.gov/pubmed/33293308 http://dx.doi.org/10.1136/bmjopen-2020-039676 |
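
The three error measures named in the METHODS part of the abstract (MAE, RMSE and MAPE) can be sketched as below. This is a minimal illustration using the standard textbook definitions, not the authors' code; it assumes MAPE is expressed as a percentage, and the function names and the `actual`/`predicted` values are hypothetical, not data from the study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is 0)."""
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Hypothetical monthly case counts and forecasts (illustrative only).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

Lower values indicate a closer fit on all three measures, which is the basis on which the abstract compares the ARIMA and XGBoost models on the training and test sets.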