Identification of the high-risk area for schistosomiasis transmission in China based on information value and machine learning: a newly data-driven modeling attempt
BACKGROUND: Schistosomiasis control is moving toward transmission interruption and eventual elimination, so evidence-led control is vital for eliminating the hidden risks of schistosomiasis. This study attempts to identify high-risk areas of schistosomiasis in China by using information...
Main authors: | Gong, Yan-Feng; Zhu, Ling-Qian; Li, Yin-Long; Zhang, Li-Juan; Xue, Jing-Bo; Xia, Shang; Lv, Shan; Xu, Jing; Li, Shi-Zhu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8237418/ https://www.ncbi.nlm.nih.gov/pubmed/34176515 http://dx.doi.org/10.1186/s40249-021-00874-9 |
_version_ | 1783714723062087680 |
---|---|
author | Gong, Yan-Feng; Zhu, Ling-Qian; Li, Yin-Long; Zhang, Li-Juan; Xue, Jing-Bo; Xia, Shang; Lv, Shan; Xu, Jing; Li, Shi-Zhu |
author_facet | Gong, Yan-Feng; Zhu, Ling-Qian; Li, Yin-Long; Zhang, Li-Juan; Xue, Jing-Bo; Xia, Shang; Lv, Shan; Xu, Jing; Li, Shi-Zhu |
author_sort | Gong, Yan-Feng |
collection | PubMed |
description | BACKGROUND: Schistosomiasis control is moving toward transmission interruption and eventual elimination, so evidence-led control is vital for eliminating the hidden risks of schistosomiasis. This study attempts to identify high-risk areas of schistosomiasis in China using information value and machine learning. METHODS: The local case distribution from schistosomiasis surveillance data in China between 2005 and 2019 was assessed based on 19 variables covering climate, geography, and social economy. Seven models in three categories were built: information value (IV); three machine learning models [logistic regression (LR), random forest (RF), generalized boosted model (GBM)]; and three coupled models (IV + LR, IV + RF, IV + GBM). Accuracy, area under the curve (AUC), and F1-score were used to evaluate the prediction performance of the models. The optimal model was selected to predict the risk distribution for schistosomiasis. RESULTS: Areas are more prone to schistosomiasis epidemics if they have paddy fields or grasslands, lie less than 2.5 km from a waterway, and have an annual average temperature of 11.5–19.0 °C and an annual average rainfall of 1000–1550 mm. IV + GBM had the best prediction performance (accuracy = 0.878, AUC = 0.902, F1 = 0.920) of the seven models. The results of IV + GBM showed that the risk areas are mainly distributed in the coastal regions of the middle and lower reaches of the Yangtze River, the Poyang Lake region, and the Dongting Lake region. High-risk areas are primarily distributed in eastern Changde, western Yueyang, northeastern Yiyang, and central Changsha in Hunan province; southern Jiujiang, northern Nanchang, northeastern Shangrao, and eastern Yichun in Jiangxi province; southern Jingzhou, southern Xiantao, and central Wuhan in Hubei province; southern Anqing, northwestern Guichi, and eastern Wuhu in Anhui province; and central Meishan, northern Leshan, and central Liangshan in Sichuan province. CONCLUSIONS: The risk of schistosomiasis transmission in China still exists, with high-risk areas relatively concentrated in the coastal regions of the middle and lower reaches of the Yangtze River. Coupled models of IV and machine learning provide effective analysis and prediction, forming a scientific basis for evidence-led surveillance and control. GRAPHIC ABSTRACT: [Image: see text] SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40249-021-00874-9. |
format | Online Article Text |
id | pubmed-8237418 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8237418 2021-06-29 Identification of the high-risk area for schistosomiasis transmission in China based on information value and machine learning: a newly data-driven modeling attempt Gong, Yan-Feng Zhu, Ling-Qian Li, Yin-Long Zhang, Li-Juan Xue, Jing-Bo Xia, Shang Lv, Shan Xu, Jing Li, Shi-Zhu Infect Dis Poverty Research Article BioMed Central 2021-06-27 /pmc/articles/PMC8237418/ /pubmed/34176515 http://dx.doi.org/10.1186/s40249-021-00874-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Gong, Yan-Feng Zhu, Ling-Qian Li, Yin-Long Zhang, Li-Juan Xue, Jing-Bo Xia, Shang Lv, Shan Xu, Jing Li, Shi-Zhu Identification of the high-risk area for schistosomiasis transmission in China based on information value and machine learning: a newly data-driven modeling attempt |
title | Identification of the high-risk area for schistosomiasis transmission in China based on information value and machine learning: a newly data-driven modeling attempt |
title_sort | identification of the high-risk area for schistosomiasis transmission in china based on information value and machine learning: a newly data-driven modeling attempt |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8237418/ https://www.ncbi.nlm.nih.gov/pubmed/34176515 http://dx.doi.org/10.1186/s40249-021-00874-9 |
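
The abstract names the information value (IV) method but this record does not give its formula. A minimal sketch follows, assuming the form commonly used in susceptibility mapping: for each class of a factor, IV is the natural log of the share of cases falling in that class divided by the share of area the class occupies. The function name, class names, and all counts are hypothetical illustrations and are not data from the study.

```python
import math


def information_value(cases_in_class, cases_total, area_in_class, area_total):
    """Assumed IV form: ln((cases_in_class / cases_total) / (area_in_class / area_total)).

    Positive values mean the class holds more cases than its area share
    would suggest; negative values mean fewer.
    """
    if cases_in_class == 0:
        # No cases observed in this class: the log is undefined, report -inf.
        return float("-inf")
    case_share = cases_in_class / cases_total
    area_share = area_in_class / area_total
    return math.log(case_share / area_share)


# Hypothetical land-use classes: (number of case sites, number of grid cells).
classes = {
    "paddy field": (60, 200),
    "grassland": (30, 150),
    "other": (10, 650),
}

cases_total = sum(cases for cases, _ in classes.values())
area_total = sum(area for _, area in classes.values())

iv = {
    name: information_value(cases, cases_total, area, area_total)
    for name, (cases, area) in classes.items()
}

for name, value in iv.items():
    print(f"{name}: IV = {value:.3f}")
```

In a coupled model such as IV + GBM, per-class values like these would plausibly replace the raw categorical or binned inputs before model fitting; how the authors performed that step is not stated in this record.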