Predictive Analysis of Diabetes-Risk with Class Imbalance
Diabetes type 2 (T2DM) is a common chronic disease, increasingly leading to many complications and affecting vital organs. Hyperglycemia is the main characteristic caused by insufficient insulin secretion and poses a serious risk to human health. The objective is to construct a type-2 diabetes predi...
Main authors: | ElSeddawy, Ahmed I., Karim, Faten Khalid, Hussein, Aisha Mohamed, Khafaga, Doaa Sami |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9578843/ https://www.ncbi.nlm.nih.gov/pubmed/36268149 http://dx.doi.org/10.1155/2022/3078025 |
_version_ | 1784812049556045824 |
---|---|
author | ElSeddawy, Ahmed I. Karim, Faten Khalid Hussein, Aisha Mohamed Khafaga, Doaa Sami |
author_sort | ElSeddawy, Ahmed I. |
collection | PubMed |
description | Diabetes type 2 (T2DM) is a common chronic disease, increasingly leading to many complications and affecting vital organs. Hyperglycemia is the main characteristic caused by insufficient insulin secretion and poses a serious risk to human health. The objective is to construct a type-2 diabetes prediction model with high classification accuracy. Advanced machine learning and predictive model techniques are utilized to achieve cutting-edge techniques for the early diagnosis of diabetes. This paper proposes an efficient performance model to predict and classify the minority class of type-2 diabetes. The impact of oversampling and undersampling approaches to reduce the effect of an unbalanced class has been compared to classification performance algorithms. Synthetic Minority Oversampling (SMOTE) and Tomek-links techniques are applied and examined. The outcomes were then compared to the original unbalanced dataset using an artificial neural network (ANN) predictive model. The model is compared with other state-of-the-art classifiers such as support vector machine (SVM), random forest (RF), and decision tree (DT). The tuned model had the best accuracy of 92.2%. The experimental findings clearly manifest the improvement in accuracy and evaluation metrics in terms of AUC and F1-measure using the SMOTE oversampling strategy rather than the baseline and undersampling schemes. The study recommends adopting dynamic hyperparameter optimization to further improve accuracy. |
format | Online Article Text |
id | pubmed-9578843 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-9578843 2022-10-19 Predictive Analysis of Diabetes-Risk with Class Imbalance ElSeddawy, Ahmed I. Karim, Faten Khalid Hussein, Aisha Mohamed Khafaga, Doaa Sami Comput Intell Neurosci Research Article Hindawi 2022-10-11 /pmc/articles/PMC9578843/ /pubmed/36268149 http://dx.doi.org/10.1155/2022/3078025 Text en Copyright © 2022 Ahmed I. ElSeddawy et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Predictive Analysis of Diabetes-Risk with Class Imbalance |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9578843/ https://www.ncbi.nlm.nih.gov/pubmed/36268149 http://dx.doi.org/10.1155/2022/3078025 |