
Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea

Severe fever with thrombocytopenia syndrome (SFTS) is an emerging tick-borne infectious disease in China, Japan, and Korea. This study aimed to estimate the monthly SFTS occurrence and the monthly number of SFTS cases in the geographical area in Korea using epidemiological data including demographic...


Bibliographic Details
Main authors: Cho, Giphil, Lee, Seungheon, Lee, Hyojung
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8575988/
https://www.ncbi.nlm.nih.gov/pubmed/34750465
http://dx.doi.org/10.1038/s41598-021-01361-9
_version_ 1784595790853832704
author Cho, Giphil
Lee, Seungheon
Lee, Hyojung
author_facet Cho, Giphil
Lee, Seungheon
Lee, Hyojung
author_sort Cho, Giphil
collection PubMed
description Severe fever with thrombocytopenia syndrome (SFTS) is an emerging tick-borne infectious disease in China, Japan, and Korea. This study aimed to estimate the monthly occurrence of SFTS and the monthly number of SFTS cases in geographical areas of Korea using epidemiological data, including demographic, geographic, and meteorological factors. Important features were chosen through univariate feature selection. Two models using machine learning methods were analyzed: the classification model in machine learning (CMML) and the regression model in machine learning (RMML). We developed a novel model that incorporates the CMML results into the RMML, defined as modified-RMML. Feature importance was computed to assess each feature's contribution to estimating the number of SFTS cases with modified-RMML. In terms of accuracy, modified-RMML reduced the MSE on the test data by 12.6–52.2% compared with the RMML across five machine learning methods. During the period of increasing SFTS cases, from May to October, modified-RMML gave more accurate estimates. The feature importance results showed that climate factors such as average maximum temperature and precipitation, as well as mountain visitors and the estimate of SFTS occurrence obtained from CMML, had high Gini importance. The novel model combining CMML and RMML improves the accuracy of the estimation of SFTS cases. According to the model, climate factors, including temperature and relative humidity, and mountain visitors play important roles in SFTS transmission in Korea. Our findings highlight the need for guidelines for mountain visitors to prevent SFTS transmission. Moreover, the study provides important insights for establishing control interventions that support early identification of SFTS cases.
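The improvement quoted in the description is a relative reduction in test-set mean squared error (MSE). A minimal sketch of how such a percentage could be computed follows; the function names and all numeric values are hypothetical and chosen only for illustration, and none of them come from the study.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def relative_mse_reduction(mse_baseline, mse_modified):
    """Percentage by which the modified model lowers the baseline MSE."""
    return 100.0 * (mse_baseline - mse_modified) / mse_baseline


# Hypothetical monthly case counts and predictions (illustration only,
# not data from the article).
observed = [0, 1, 3, 5, 2, 0]
rmml_pred = [1, 1, 1, 3, 3, 1]        # assumed baseline regression output
modified_pred = [0, 1, 2, 4, 3, 0]    # assumed modified-model output

baseline = mse(observed, rmml_pred)
modified = mse(observed, modified_pred)
reduction = relative_mse_reduction(baseline, modified)
print(f"MSE reduced by {reduction:.1f}% relative to the baseline")
```

A positive value means the modified model has lower test error than the baseline; a negative value would mean it performs worse.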
format Online
Article
Text
id pubmed-8575988
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8575988 2021-11-10 Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea Cho, Giphil Lee, Seungheon Lee, Hyojung Sci Rep Article Nature Publishing Group UK 2021-11-08 /pmc/articles/PMC8575988/ /pubmed/34750465 http://dx.doi.org/10.1038/s41598-021-01361-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Cho, Giphil
Lee, Seungheon
Lee, Hyojung
Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea
title Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea
title_full Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea
title_fullStr Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea
title_full_unstemmed Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea
title_short Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea
title_sort estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in south korea
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8575988/
https://www.ncbi.nlm.nih.gov/pubmed/34750465
http://dx.doi.org/10.1038/s41598-021-01361-9
work_keys_str_mv AT chogiphil estimatingseverefeverwiththrombocytopeniasyndrometransmissionusingmachinelearningmethodsinsouthkorea
AT leeseungheon estimatingseverefeverwiththrombocytopeniasyndrometransmissionusingmachinelearningmethodsinsouthkorea
AT leehyojung estimatingseverefeverwiththrombocytopeniasyndrometransmissionusingmachinelearningmethodsinsouthkorea