
Diabetes mellitus early warning and factor analysis using ensemble Bayesian networks with SMOTE-ENN and Boruta

Diabetes mellitus (DM) has become the third chronic non-infectious disease affecting patients after tumor, cardiovascular and cerebrovascular diseases, becoming one of the major public health issues worldwide. Detection of early warning risk factors for DM is key to the prevention of DM, which has b...


Bibliographic Details
Main authors: Wang, Xuchun; Ren, Jiahui; Ren, Hao; Song, Wenzhu; Qiao, Yuchao; Zhao, Ying; Linghu, Liqin; Cui, Yu; Zhao, Zhiyang; Chen, Limin; Qiu, Lixia
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10404250/
https://www.ncbi.nlm.nih.gov/pubmed/37543637
http://dx.doi.org/10.1038/s41598-023-40036-5
author Wang, Xuchun
Ren, Jiahui
Ren, Hao
Song, Wenzhu
Qiao, Yuchao
Zhao, Ying
Linghu, Liqin
Cui, Yu
Zhao, Zhiyang
Chen, Limin
Qiu, Lixia
collection PubMed
description Diabetes mellitus (DM) ranks third among chronic non-infectious diseases affecting patients, after tumors and cardiovascular and cerebrovascular diseases, and has become one of the major public health issues worldwide. Detecting early warning risk factors for DM is key to its prevention and has been the focus of some previous studies. Therefore, from the perspective of residents' self-management and prevention, this study built Bayesian networks (BNs) that combine feature screening with multiple resampling techniques, using class-imbalanced DM monitoring data from Shanxi Province, China, to detect risk factors in chronic disease monitoring programs and predict the risk of DM. First, univariate analysis and the Boruta feature selection algorithm were used for preliminary screening of all included risk factors. Then, three resampling techniques, SMOTE, Borderline-SMOTE (BL-SMOTE) and SMOTE-ENN, were applied to address the class imbalance. Finally, BNs learned with three algorithms (Tabu, hill-climbing and MMHC) were constructed from the processed data to find the warning factors that correlate strongly with DM. The results showed that BNs constructed from the processed data significantly improved the accuracy of DM classification. In particular, BNs combined with SMOTE-ENN resampling improved the most, and BNs constructed with the Tabu algorithm achieved better classification performance than those built with the hill-climbing and MMHC algorithms. The best-performing joint Boruta-SMOTE-ENN-Tabu model showed that the risk factors for DM included family history, age, central obesity, hyperlipidemia, salt reduction, occupation, heart rate, and BMI.
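The description names SMOTE-ENN as the resampling step that helped most. As a rough illustration of that idea only (not the authors' code, which this record does not include), the standard-library Python sketch below oversamples a minority class by interpolating between minority neighbours (SMOTE) and then drops samples whose label disagrees with the majority vote of their nearest neighbours (ENN). The function names, `k=3`, the majority-vote rule, and the toy data are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, rng=None):
    """Binary-class SMOTE: add synthetic minority points until both classes match."""
    rng = rng or random.Random(0)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    deficit = (len(y) - len(minority_pts)) - len(minority_pts)
    X_new, y_new = list(X), list(y)
    for _ in range(max(deficit, 0)):
        i = rng.randrange(len(minority_pts))
        j = rng.choice(nearest(i, minority_pts, k))
        gap = rng.random()  # position along the segment between the two points
        a, b = minority_pts[i], minority_pts[j]
        X_new.append(tuple(p + gap * (q - p) for p, q in zip(a, b)))
        y_new.append(minority)
    return X_new, y_new


def enn(X, y, k=3):
    """Edited Nearest Neighbours: keep a sample only if its k neighbours mostly agree with it."""
    keep = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in nearest(i, X, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, minority, k=3, seed=0):
    """Oversample with SMOTE, then clean both classes with ENN."""
    X_bal, y_bal = smote(X, y, minority, k, random.Random(seed))
    return enn(X_bal, y_bal, k)


# Toy imbalanced data (hypothetical): 40 majority points, 8 minority points.
data_rng = random.Random(42)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(8)]
y = [0] * 40 + [1] * 8

X_res, y_res = smote_enn(X, y, minority=1)
print("before:", Counter(y), "after:", Counter(y_res))
```

Cleaning both classes after oversampling is one common way to combine the two steps; library implementations may use a stricter agreement rule or different defaults, so results from this sketch should not be compared with the figures reported in the article.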
format Online
Article
Text
id pubmed-10404250
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10404250 2023-08-07 Sci Rep Article
Nature Publishing Group UK 2023-08-05 /pmc/articles/PMC10404250/ /pubmed/37543637 http://dx.doi.org/10.1038/s41598-023-40036-5 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Diabetes mellitus early warning and factor analysis using ensemble Bayesian networks with SMOTE-ENN and Boruta
topic Article