
Diabetes prediction using machine learning and explainable AI techniques

Globally, diabetes affects 537 million people, making it the deadliest and most common non‐communicable disease. Many factors can contribute to diabetes, such as excess body weight, abnormal cholesterol levels, family history, physical inactivity, and poor dietary habits. Increased...


Bibliographic Details
Main Authors: Tasin, Isfafuzzaman; Nabil, Tansin Ullah; Islam, Sanjida; Khan, Riasat
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10107388/
https://www.ncbi.nlm.nih.gov/pubmed/37077883
http://dx.doi.org/10.1049/htl2.12039
author Tasin, Isfafuzzaman
Nabil, Tansin Ullah
Islam, Sanjida
Khan, Riasat
collection PubMed
description Globally, diabetes affects 537 million people, making it the deadliest and most common non‐communicable disease. Many factors can contribute to diabetes, such as excess body weight, abnormal cholesterol levels, family history, physical inactivity, and poor dietary habits. Increased urination is one of the most common symptoms of the disease. Long-standing diabetes can lead to complications such as heart disease, kidney disease, nerve damage, and diabetic retinopathy, but the risk can be reduced if the disease is predicted early. In this paper, an automatic diabetes prediction system is developed using a private dataset of female patients in Bangladesh and various machine learning techniques. The authors used the Pima Indian diabetes dataset and collected additional samples from 203 individuals at a local textile factory in Bangladesh. Mutual information is applied as the feature selection algorithm. A semi‐supervised model with extreme gradient boosting is used to predict the insulin feature of the private dataset. The SMOTE and ADASYN approaches are employed to manage the class imbalance problem. The authors compared several machine learning classifiers (decision tree, SVM, random forest, logistic regression, KNN, and various ensemble techniques) to determine which algorithm produces the best prediction results. After training and testing all the classification models, the proposed system achieved its best result with the XGBoost classifier combined with ADASYN: 81% accuracy, an F1 score of 0.81, and an AUC of 0.84. Furthermore, a domain adaptation method is implemented to demonstrate the versatility of the proposed system. Explainable AI with the LIME and SHAP frameworks is used to understand how the model arrives at its predictions. Finally, a website framework and an Android smartphone application have been developed to input various features and predict diabetes instantaneously. The private dataset of female Bangladeshi patients and the program code are available at the following link: https://github.com/tansin-nabil/Diabetes-Prediction-Using-Machine-Learning.
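The class-imbalance step mentioned in the description can be illustrated with a minimal SMOTE-style sketch. This is not the authors' code: the record does not say which library or parameters they used, and the function name `smote`, the parameter `k`, and the toy (glucose, BMI) values below are all hypothetical. The sketch only shows the core idea of SMOTE, which is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. ADASYN builds on the same interpolation but generates more samples around minority points that have many majority-class neighbours; that weighting is not shown here.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # between the base point and the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: (glucose, BMI) pairs for the minority class.
minority_class = [(148.0, 33.6), (183.0, 23.3), (137.0, 43.1), (197.0, 30.5)]
new_points = smote(minority_class, n_synthetic=6)
print(len(new_points))
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used for evaluation.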
format Online
Article
Text
id pubmed-10107388
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-10107388 2023-04-18
Diabetes prediction using machine learning and explainable AI techniques
Tasin, Isfafuzzaman; Nabil, Tansin Ullah; Islam, Sanjida; Khan, Riasat
Healthc Technol Lett
Letters
John Wiley and Sons Inc. 2022-12-14
/pmc/articles/PMC10107388/ /pubmed/37077883 http://dx.doi.org/10.1049/htl2.12039
Text en
© 2022 The Authors. Healthcare Technology Letters published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Diabetes prediction using machine learning and explainable AI techniques
topic Letters