Application of supervised machine learning algorithms for classification and prediction of type-2 diabetes disease status in Afar regional state, Northeastern Ethiopia 2021

Ethiopia faces a growing burden of diabetes in general and type-2 diabetes in particular. Knowledge extracted from stored datasets can support better decisions on rapid diabetes diagnosis and can suggest predictions for early intervention. Thus, this study was add...

Bibliographic Details
Main authors: Ebrahim, Oumer Abdulkadir, Derbew, Getachew
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10182985/
https://www.ncbi.nlm.nih.gov/pubmed/37179444
http://dx.doi.org/10.1038/s41598-023-34906-1
_version_ 1785041861521440768
author Ebrahim, Oumer Abdulkadir
Derbew, Getachew
author_facet Ebrahim, Oumer Abdulkadir
Derbew, Getachew
author_sort Ebrahim, Oumer Abdulkadir
collection PubMed
description Ethiopia faces a growing burden of diabetes in general and type-2 diabetes in particular. Knowledge extracted from stored datasets can support better decisions on rapid diabetes diagnosis and can suggest predictions for early intervention. This study therefore addressed these problems by applying supervised machine learning algorithms to classify and predict type-2 diabetes disease status; the results might provide context-specific information to program planners and policy makers so that priority is given to the more affected groups. The objective was to apply supervised machine learning algorithms, compare them, and select the best algorithm, based on performance, for classifying and predicting type-2 diabetes disease status (positive or negative) in public hospitals of Afar regional state, Northeastern Ethiopia. The study was conducted in Afar regional state from February to June 2021. Seven supervised machine learning algorithms were applied to secondary data obtained by reviewing medical database records: decision tree (pruned J48), artificial neural network, k-nearest neighbor, support vector machine, binary logistic regression, random forest, and Naïve Bayes. A dataset of 2239 samples diagnosed for diabetes from 2012 to April 22, 2020 (1523 with type-2 diabetes and 716 without type-2 diabetes) was checked for completeness prior to analysis. WEKA 3.7 was used for the analysis with all algorithms. The algorithms were compared on correctly classified rate, kappa statistic, confusion matrix, area under the curve, sensitivity, and specificity.
Of the seven supervised machine learning algorithms, random forest gave the best classification and prediction results [correctly classified rate 93.8%, kappa statistic 0.85, sensitivity 0.98, area under the curve 0.97, and, in the confusion matrix, 446 of 454 actual positives predicted as positive]. It was followed by the pruned J48 decision tree [correctly classified rate 91.8%, kappa statistic 0.80, sensitivity 0.96, area under the curve 0.91, and 438 of 454 actual positives predicted as positive] and k-nearest neighbor [correctly classified rate 89.8%, kappa statistic 0.76, sensitivity 0.92, area under the curve 0.88, and 421 of 454 actual positives predicted as positive]. Random forest, pruned J48 decision tree, and k-nearest neighbor thus performed better than the other algorithms at classifying and predicting type-2 diabetes disease status. Based on this performance, the random forest algorithm can be considered a suggestive and supportive aid for clinicians at the time of type-2 diabetes diagnosis.
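The comparison measures named in the description can all be derived from a binary confusion matrix. The following is a minimal Python sketch of those formulas, not the study's WEKA workflow; the function name `binary_metrics` and the toy counts are assumptions made for illustration. Only the 446-of-454 positive counts come from the record; the negative-class counts of the test set are not given here, so the full example uses clearly hypothetical numbers.

```python
import math


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute accuracy, sensitivity, specificity and Cohen's kappa
    from the four cells of a binary confusion matrix."""
    n = tp + fn + fp + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    # Correctly classified rate (accuracy): share of all cases on the diagonal.
    accuracy = (tp + tn) / n

    # Sensitivity: share of actual positives predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan

    # Specificity: share of actual negatives predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else math.nan

    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else math.nan

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Counts stated in the record for random forest: 446 of 454 actual
# positives were predicted positive, so 8 were missed.
reported_sensitivity = 446 / (446 + 8)

# HYPOTHETICAL toy matrix (not from the study), used only to show the
# kappa calculation end to end.
toy = binary_metrics(tp=40, fn=10, fp=5, tn=45)

if __name__ == "__main__":
    print(f"sensitivity from reported counts: {reported_sensitivity:.3f}")
    for name, value in toy.items():
        print(f"toy {name}: {value:.3f}")
```

The sensitivity recomputed from the stated counts is consistent with the 0.98 reported for random forest. Kappa and the correctly classified rate cannot be rechecked from this record alone because they also depend on the negative-class counts.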
format Online
Article
Text
id pubmed-10182985
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10182985 2023-05-15 Application of supervised machine learning algorithms for classification and prediction of type-2 diabetes disease status in Afar regional state, Northeastern Ethiopia 2021 Ebrahim, Oumer Abdulkadir Derbew, Getachew Sci Rep Article Nature Publishing Group UK 2023-05-13 /pmc/articles/PMC10182985/ /pubmed/37179444 http://dx.doi.org/10.1038/s41598-023-34906-1 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Ebrahim, Oumer Abdulkadir
Derbew, Getachew
Application of supervised machine learning algorithms for classification and prediction of type-2 diabetes disease status in Afar regional state, Northeastern Ethiopia 2021
title Application of supervised machine learning algorithms for classification and prediction of type-2 diabetes disease status in Afar regional state, Northeastern Ethiopia 2021
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10182985/
https://www.ncbi.nlm.nih.gov/pubmed/37179444
http://dx.doi.org/10.1038/s41598-023-34906-1