Predictive Analysis of Diabetes-Risk with Class Imbalance

Type 2 diabetes mellitus (T2DM) is a common chronic disease that increasingly leads to many complications and affects vital organs. Hyperglycemia, its main characteristic, is caused by insufficient insulin secretion and poses a serious risk to human health. The objective is to construct a type-2 diabetes predi...

Bibliographic Details
Main authors: ElSeddawy, Ahmed I., Karim, Faten Khalid, Hussein, Aisha Mohamed, Khafaga, Doaa Sami
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9578843/
https://www.ncbi.nlm.nih.gov/pubmed/36268149
http://dx.doi.org/10.1155/2022/3078025
_version_ 1784812049556045824
author ElSeddawy, Ahmed I.
Karim, Faten Khalid
Hussein, Aisha Mohamed
Khafaga, Doaa Sami
author_facet ElSeddawy, Ahmed I.
Karim, Faten Khalid
Hussein, Aisha Mohamed
Khafaga, Doaa Sami
author_sort ElSeddawy, Ahmed I.
collection PubMed
description Type 2 diabetes mellitus (T2DM) is a common chronic disease that increasingly leads to many complications and affects vital organs. Hyperglycemia, its main characteristic, is caused by insufficient insulin secretion and poses a serious risk to human health. The objective is to construct a type-2 diabetes prediction model with high classification accuracy. Advanced machine learning and predictive modeling techniques are used for the early diagnosis of diabetes. This paper proposes an efficient model to predict and classify the minority class of type-2 diabetes. Oversampling and undersampling approaches for reducing the effect of class imbalance are compared in terms of their impact on classification performance. The Synthetic Minority Oversampling Technique (SMOTE) and Tomek-links techniques are applied and examined. The outcomes are then compared with those for the original unbalanced dataset using an artificial neural network (ANN) predictive model. The model is compared with other state-of-the-art classifiers such as support vector machine (SVM), random forest (RF), and decision tree (DT). The tuned model had the best accuracy of 92.2%. The experimental findings show that the SMOTE oversampling strategy improves accuracy, AUC, and F1-measure relative to the baseline and undersampling schemes. The study recommends adopting dynamic hyperparameter optimization to further improve accuracy.
format Online
Article
Text
id pubmed-9578843
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9578843 2022-10-19 Predictive Analysis of Diabetes-Risk with Class Imbalance ElSeddawy, Ahmed I. Karim, Faten Khalid Hussein, Aisha Mohamed Khafaga, Doaa Sami Comput Intell Neurosci Research Article Hindawi 2022-10-11 /pmc/articles/PMC9578843/ /pubmed/36268149 http://dx.doi.org/10.1155/2022/3078025 Text en Copyright © 2022 Ahmed I. ElSeddawy et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
ElSeddawy, Ahmed I.
Karim, Faten Khalid
Hussein, Aisha Mohamed
Khafaga, Doaa Sami
Predictive Analysis of Diabetes-Risk with Class Imbalance
title Predictive Analysis of Diabetes-Risk with Class Imbalance
title_full Predictive Analysis of Diabetes-Risk with Class Imbalance
title_fullStr Predictive Analysis of Diabetes-Risk with Class Imbalance
title_full_unstemmed Predictive Analysis of Diabetes-Risk with Class Imbalance
title_short Predictive Analysis of Diabetes-Risk with Class Imbalance
title_sort predictive analysis of diabetes-risk with class imbalance
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9578843/
https://www.ncbi.nlm.nih.gov/pubmed/36268149
http://dx.doi.org/10.1155/2022/3078025
work_keys_str_mv AT elseddawyahmedi predictiveanalysisofdiabetesriskwithclassimbalance
AT karimfatenkhalid predictiveanalysisofdiabetesriskwithclassimbalance
AT husseinaishamohamed predictiveanalysisofdiabetesriskwithclassimbalance
AT khafagadoaasami predictiveanalysisofdiabetesriskwithclassimbalance
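
Illustrative note on the resampling techniques named in the abstract: the record does not include the authors' code, so the following is only a minimal sketch of the general ideas behind SMOTE oversampling and Tomek-link detection, written with the Python standard library. The function names, the neighbor count `k`, the `seed` parameter, and the toy data are all assumptions made for illustration; none of them come from the paper.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward neighbors.

    Sketch of the general SMOTE idea, not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links.

    A Tomek link is a pair of samples from different classes that are
    each other's nearest neighbor.
    """
    def nearest(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))

    links = []
    for i in range(len(X)):
        j = nearest(i)
        if i < j and y[i] != y[j] and nearest(j) == i:
            links.append((i, j))
    return links


# Toy data (hypothetical): label 1 is the minority class.
X = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.2, 5.0], [1.0, 1.0], [1.2, 0.9]]
y = [0, 1, 0, 0, 1, 1]

minority = [x for x, label in zip(X, y) if label == 1]
new_points = smote(minority, n_synthetic=4, k=2)  # oversampling step
links = tomek_links(X, y)                         # candidates for cleaning
```

In an undersampling setup, the majority-class member of each Tomek link is typically dropped, whereas SMOTE appends the synthetic points to the minority class before training. In practice a maintained library implementation would normally be preferred over a hand-written version like this one.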