Diabetes mellitus early warning and factor analysis using ensemble Bayesian networks with SMOTE-ENN and Boruta
Diabetes mellitus (DM) has become the third chronic non-infectious disease affecting patients after tumor, cardiovascular and cerebrovascular diseases, becoming one of the major public health issues worldwide. Detection of early warning risk factors for DM is key to the prevention of DM, which has b...
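Main Authors: | Wang, Xuchun; Ren, Jiahui; Ren, Hao; Song, Wenzhu; Qiao, Yuchao; Zhao, Ying; Linghu, Liqin; Cui, Yu; Zhao, Zhiyang; Chen, Limin; Qiu, Lixia |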
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10404250/ https://www.ncbi.nlm.nih.gov/pubmed/37543637 http://dx.doi.org/10.1038/s41598-023-40036-5 |
_version_ | 1785085257065693184 |
---|---|
author | Wang, Xuchun; Ren, Jiahui; Ren, Hao; Song, Wenzhu; Qiao, Yuchao; Zhao, Ying; Linghu, Liqin; Cui, Yu; Zhao, Zhiyang; Chen, Limin; Qiu, Lixia |
author_sort | Wang, Xuchun |
collection | PubMed |
description | Diabetes mellitus (DM) has become the third chronic non-infectious disease affecting patients, after tumors and cardiovascular and cerebrovascular diseases, and is one of the major public health issues worldwide. Detecting early warning risk factors is key to preventing DM and has been the focus of some previous studies. From the perspective of residents' self-management and prevention, this study therefore constructed Bayesian networks (BNs) that combine feature screening with multiple resampling techniques, using class-imbalanced DM monitoring data from Shanxi Province, China, to detect risk factors in chronic disease monitoring programs and predict the risk of DM. First, univariate analysis and the Boruta feature selection algorithm were used for preliminary screening of all included risk factors. Then, three resampling techniques, SMOTE, Borderline-SMOTE (BL-SMOTE) and SMOTE-ENN, were adopted to deal with the data imbalance. Finally, BNs were learned from the processed data with three algorithms (Tabu, hill-climbing and MMHC) to find the warning factors that strongly correlate with DM. The results showed that BNs constructed from the processed data significantly improved the accuracy of DM classification. In particular, BNs combined with SMOTE-ENN resampling improved the most, and BNs constructed by the Tabu algorithm achieved better classification performance than those built with the hill-climbing and MMHC algorithms. The best-performing joint Boruta-SMOTE-ENN-Tabu model showed that the risk factors of DM included family history, age, central obesity, hyperlipidemia, salt reduction, occupation, heart rate, and BMI. |
format | Online Article Text |
id | pubmed-10404250 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10404250 2023-08-07 Sci Rep Article. Nature Publishing Group UK 2023-08-05 /pmc/articles/PMC10404250/ /pubmed/37543637 http://dx.doi.org/10.1038/s41598-023-40036-5 Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
title | Diabetes mellitus early warning and factor analysis using ensemble Bayesian networks with SMOTE-ENN and Boruta |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10404250/ https://www.ncbi.nlm.nih.gov/pubmed/37543637 http://dx.doi.org/10.1038/s41598-023-40036-5 |
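
The description field names SMOTE-ENN as the resampling step that helped most. As a rough illustration of that idea only, the sketch below oversamples the minority class by interpolating between neighbouring minority samples (SMOTE) and then drops samples whose label disagrees with most of their nearest neighbours (Edited Nearest Neighbours). It is a minimal pure-Python sketch, not the authors' implementation: the function names, the parameter `k`, the choice to oversample to class parity, and the toy data are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def k_nearest(point, candidates, k):
    """Return indices of the k candidates closest to `point` (Euclidean)."""
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(point, candidates[i]))
    return order[:k]


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each on the segment between a random
    minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        nearest = k_nearest(base, others, min(k, len(others)))
        neighbour = others[rng.choice(nearest)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (n - b)
                               for b, n in zip(base, neighbour)))
    return synthetic


def enn(X, y, k=3):
    """Edited Nearest Neighbours: keep a sample only if its label matches
    the majority label among its k nearest neighbours."""
    keep_X, keep_y = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        votes = Counter(rest_y[j] for j in k_nearest(x, rest_X, k))
        if votes.most_common(1)[0][0] == label:
            keep_X.append(x)
            keep_y.append(label)
    return keep_X, keep_y


def smote_enn(X, y, minority_label, k=3, seed=0):
    """Oversample the minority class to parity (assumption), then clean
    the combined data set with ENN."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max(len(X) - 2 * len(minority), 0)
    new_points = smote(minority, n_new, k, random.Random(seed))
    X_all = list(X) + new_points
    y_all = list(y) + [minority_label] * len(new_points)
    return enn(X_all, y_all, k)


# Invented toy data: six majority samples (label 0), two minority (label 1).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.4, 0.2),
     (2.0, 2.0), (2.1, 2.2)]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_res, y_res = smote_enn(X, y, minority_label=1)
print(Counter(y_res))
```

In practice a maintained library would normally be used instead of hand-written resampling, and resampling is usually applied to the training split only so that evaluation data keeps its original class balance.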