
An Ensemble Approach for the Prediction of Diabetes Mellitus Using a Soft Voting Classifier with an Explainable AI

Diabetes is a chronic disease that continues to be a primary and worldwide health concern since the health of the entire population has been affected by it. Over the years, many academics have attempted to develop a reliable diabetes prediction model using machine learning (ML) algorithms. However,...


Bibliographic Details
Main authors: Kibria, Hafsa Binte, Nahiduzzaman, Md, Goni, Md. Omaer Faruq, Ahsan, Mominul, Haider, Julfikar
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9571784/
https://www.ncbi.nlm.nih.gov/pubmed/36236367
http://dx.doi.org/10.3390/s22197268
author Kibria, Hafsa Binte
Nahiduzzaman, Md
Goni, Md. Omaer Faruq
Ahsan, Mominul
Haider, Julfikar
collection PubMed
description Diabetes is a chronic disease that remains a major worldwide health concern, affecting the health of populations everywhere. Over the years, many researchers have attempted to develop a reliable diabetes prediction model using machine learning (ML) algorithms. However, these studies have had minimal impact on clinical practice because they focus mainly on improving the performance of complicated ML models while ignoring their explainability in clinical settings. As a result, physicians find these models difficult to understand and rarely trust them for clinical use. In this study, a carefully constructed, efficient, and interpretable diabetes detection method using explainable AI is proposed. The Pima Indian diabetes dataset was used, containing 768 instances, of which 268 are diabetic and 500 are non-diabetic, with several diabetes-related attributes. Six machine learning algorithms (artificial neural network (ANN), random forest (RF), support vector machine (SVM), logistic regression (LR), AdaBoost, and XGBoost) were used along with an ensemble classifier to diagnose diabetes. For each machine learning model, global and local explanations were produced using Shapley additive explanations (SHAP) and presented in different types of graphs to help physicians understand the model predictions. The developed weighted ensemble model achieved a balanced accuracy of 90% and an F1 score of 89% under five-fold cross-validation (CV). Missing values were imputed with median values, and the synthetic minority oversampling technique combined with Tomek links (SMOTETomek) was used to balance the classes of the dataset. The proposed approach can improve the clinical understanding of a diabetes diagnosis and help in taking necessary action at the very early stages of the disease.
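The abstract names three building blocks that are easy to state concretely: median imputation, a weighted soft vote over per-model probabilities, and the balanced accuracy and F1 metrics. The following is a minimal standard-library sketch of those ideas only, not the authors' implementation. All function names, weights, and numbers are hypothetical illustrations and do not come from the paper or from the Pima dataset.

```python
from statistics import median


def impute_median(column):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def soft_vote(probabilities, weights):
    """Weighted average of each model's predicted probability of diabetes."""
    return sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 means diabetic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; assumes both classes occur in y_true."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall; assumes a non-zero denominator."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical usage: one feature column with a gap, three models, four patients.
glucose = impute_median([148, None, 183, 89])
weights = [2, 1, 1]  # illustrative weights, not the paper's
per_patient_probs = [
    [0.9, 0.4, 0.6],
    [0.2, 0.3, 0.1],
    [0.6, 0.7, 0.8],
    [0.4, 0.2, 0.3],
]
y_true = [1, 0, 1, 0]
y_pred = [1 if soft_vote(p, weights) >= 0.5 else 0 for p in per_patient_probs]
print(glucose, balanced_accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```

Soft voting averages probabilities rather than counting class votes, so a confident model can outweigh two lukewarm ones. Balanced accuracy is the natural companion metric when the classes are uneven, as with the 268 to 500 split reported above.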
format Online
Article
Text
id pubmed-9571784
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9571784 2022-10-17 An Ensemble Approach for the Prediction of Diabetes Mellitus Using a Soft Voting Classifier with an Explainable AI Kibria, Hafsa Binte Nahiduzzaman, Md Goni, Md. Omaer Faruq Ahsan, Mominul Haider, Julfikar Sensors (Basel) Article MDPI 2022-09-25 /pmc/articles/PMC9571784/ /pubmed/36236367 http://dx.doi.org/10.3390/s22197268 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title An Ensemble Approach for the Prediction of Diabetes Mellitus Using a Soft Voting Classifier with an Explainable AI
topic Article