Analysis and prediction of hand, foot and mouth disease incidence in China using Random Forest and XGBoost


Bibliographic Details
Main authors: Meng, Delin; Xu, Jun; Zhao, Jijun
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8694472/
https://www.ncbi.nlm.nih.gov/pubmed/34936688
http://dx.doi.org/10.1371/journal.pone.0261629
author Meng, Delin
Xu, Jun
Zhao, Jijun
collection PubMed
description Hand, foot and mouth disease (HFMD) is an increasingly serious public health problem that has caused outbreaks in China every year since 2008. Predicting HFMD incidence and analyzing the factors that influence it are of great significance for prevention. Machine learning has shown advantages in infectious disease modeling, but few machine-learning studies of HFMD incidence cover all the provinces of mainland China. In this study, we used two machine learning algorithms, Random Forest and eXtreme Gradient Boosting (XGBoost), for analysis and prediction. We first used Random Forest to examine the association between HFMD incidence and potential influential factors for 31 provinces in mainland China. Next, we established Random Forest and XGBoost prediction models using meteorological and social factors as predictors. Finally, we applied the prediction models to four different regions of mainland China and evaluated their performance. Our results show that: 1) Meteorological and social factors jointly affect the incidence of HFMD in mainland China, and average temperature and population density are the two most significant influential factors; 2) Population flux affects HFMD incidence with different delays in different regions, and from a national perspective the model using population flux data delayed by one month has better prediction performance; 3) Overall, the XGBoost model predicted better than the Random Forest model, and it is more suitable for predicting the incidence of HFMD in mainland China.
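The abstract mentions using population flux data "delayed for one month" as a predictor. The sketch below shows one plausible way such a delayed predictor could be lined up with monthly incidence; it is not the authors' code. The function name `lagged_pairs` and the numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def lagged_pairs(
    flux: List[float], incidence: List[float], lag: int
) -> List[Tuple[float, float]]:
    """Pair incidence in month t with population flux in month t - lag."""
    if len(flux) != len(incidence):
        raise ValueError("series must have the same length")
    if lag < 0:
        raise ValueError("lag must be non-negative")
    # The first `lag` months have no earlier flux value, so they are dropped.
    return [(flux[t - lag], incidence[t]) for t in range(lag, len(incidence))]


# Made-up monthly values, for illustration only (not data from the study).
flux = [10.0, 12.0, 15.0, 11.0, 9.0]
incidence = [1.0, 2.0, 2.5, 3.0, 2.0]

# Each pair is (flux one month earlier, incidence in the current month).
pairs_lag1 = lagged_pairs(flux, incidence, lag=1)
print(pairs_lag1)
```

Pairs built this way for several candidate lags could then be passed to a regressor of the reader's choice, and the lag that gives the best held-out performance compared across regions.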
format Online
Article
Text
id pubmed-8694472
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8694472 2021-12-23 Analysis and prediction of hand, foot and mouth disease incidence in China using Random Forest and XGBoost. Meng, Delin; Xu, Jun; Zhao, Jijun. PLoS One. Research Article.
Public Library of Science 2021-12-22 /pmc/articles/PMC8694472/ /pubmed/34936688 http://dx.doi.org/10.1371/journal.pone.0261629 Text en © 2021 Meng et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Analysis and prediction of hand, foot and mouth disease incidence in China using Random Forest and XGBoost
topic Research Article