Detection of the chronic kidney disease using XGBoost classifier and explaining the influence of the attributes on the model using SHAP
Chronic kidney disease (CKD) is a condition distinguished by structural and functional changes to the kidney over time. Studies show that 10% of adults worldwide are affected by some kind of CKD, resulting in 1.2 million deaths. Recently, CKD has emerged as a leading cause of mortality worldwide, ma...
Main authors: | Raihan, Md. Johir; Khan, Md. Al-Masrur; Kee, Seong-Hoon; Nahid, Abdullah-Al |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10110580/ https://www.ncbi.nlm.nih.gov/pubmed/37069256 http://dx.doi.org/10.1038/s41598-023-33525-0 |
_version_ | 1785027290366738432 |
---|---|
author | Raihan, Md. Johir Khan, Md. Al-Masrur Kee, Seong-Hoon Nahid, Abdullah-Al |
author_facet | Raihan, Md. Johir Khan, Md. Al-Masrur Kee, Seong-Hoon Nahid, Abdullah-Al |
author_sort | Raihan, Md. Johir |
collection | PubMed |
description | Chronic kidney disease (CKD) is a condition distinguished by structural and functional changes to the kidney over time. Studies show that 10% of adults worldwide are affected by some kind of CKD, resulting in 1.2 million deaths. Recently, CKD has emerged as a leading cause of mortality worldwide, making it necessary to develop a Computer-Aided Diagnostic (CAD) system to diagnose CKD automatically. A Machine Learning (ML) based CAD system can be used by a clinician to automatically diagnose large numbers of people. Since ML models are considered black boxes, it is also necessary to expose the influential causes behind a model's prediction of a particular output, so that a doctor can make a more rational decision based on the model's output and an analysis of the features' influence on the model. In this paper, we have used XGBoost as the ML classifier to predict whether a patient has CKD or not. Using the XGBoost classifier, we have obtained an accuracy, precision, recall, and F1 score of [Formula: see text] and [Formula: see text] respectively using all [Formula: see text] features. Furthermore, we have used the Biogeography Based Optimization (BBO) algorithm to find an effective subset of the features. The BBO algorithm selected almost half of the initial features. We have obtained an accuracy, precision, recall, and F1 score of [Formula: see text] and [Formula: see text] respectively using only 13 features selected by the BBO algorithm. Finally, we have explained the impact of the features on the ML model using SHapley Additive exPlanations (SHAP) analysis. Using SHAP analysis and the BBO algorithm, we have found that hemoglobin and albumin contribute most to the detection of CKD. |
format | Online Article Text |
id | pubmed-10110580 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10110580 2023-04-19 Detection of the chronic kidney disease using XGBoost classifier and explaining the influence of the attributes on the model using SHAP Raihan, Md. Johir Khan, Md. Al-Masrur Kee, Seong-Hoon Nahid, Abdullah-Al Sci Rep Article Chronic kidney disease (CKD) is a condition distinguished by structural and functional changes to the kidney over time. Studies show that 10% of adults worldwide are affected by some kind of CKD, resulting in 1.2 million deaths. Recently, CKD has emerged as a leading cause of mortality worldwide, making it necessary to develop a Computer-Aided Diagnostic (CAD) system to diagnose CKD automatically. A Machine Learning (ML) based CAD system can be used by a clinician to automatically diagnose large numbers of people. Since ML models are considered black boxes, it is also necessary to expose the influential causes behind a model's prediction of a particular output, so that a doctor can make a more rational decision based on the model's output and an analysis of the features' influence on the model. In this paper, we have used XGBoost as the ML classifier to predict whether a patient has CKD or not. Using the XGBoost classifier, we have obtained an accuracy, precision, recall, and F1 score of [Formula: see text] and [Formula: see text] respectively using all [Formula: see text] features. Furthermore, we have used the Biogeography Based Optimization (BBO) algorithm to find an effective subset of the features. The BBO algorithm selected almost half of the initial features. We have obtained an accuracy, precision, recall, and F1 score of [Formula: see text] and [Formula: see text] respectively using only 13 features selected by the BBO algorithm. Finally, we have explained the impact of the features on the ML model using SHapley Additive exPlanations (SHAP) analysis. Using SHAP analysis and the BBO algorithm, we have found that hemoglobin and albumin contribute most to the detection of CKD. Nature Publishing Group UK 2023-04-17 /pmc/articles/PMC10110580/ /pubmed/37069256 http://dx.doi.org/10.1038/s41598-023-33525-0 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Raihan, Md. Johir Khan, Md. Al-Masrur Kee, Seong-Hoon Nahid, Abdullah-Al Detection of the chronic kidney disease using XGBoost classifier and explaining the influence of the attributes on the model using SHAP |
title | Detection of the chronic kidney disease using XGBoost classifier and explaining the influence of the attributes on the model using SHAP |
title_full | Detection of the chronic kidney disease using XGBoost classifier and explaining the influence of the attributes on the model using SHAP |
title_fullStr | Detection of the chronic kidney disease using XGBoost classifier and explaining the influence of the attributes on the model using SHAP |
title_full_unstemmed | Detection of the chronic kidney disease using XGBoost classifier and explaining the influence of the attributes on the model using SHAP |
title_short | Detection of the chronic kidney disease using XGBoost classifier and explaining the influence of the attributes on the model using SHAP |
title_sort | detection of the chronic kidney disease using xgboost classifier and explaining the influence of the attributes on the model using shap |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10110580/ https://www.ncbi.nlm.nih.gov/pubmed/37069256 http://dx.doi.org/10.1038/s41598-023-33525-0 |
work_keys_str_mv | AT raihanmdjohir detectionofthechronickidneydiseaseusingxgboostclassifierandexplainingtheinfluenceoftheattributesonthemodelusingshap AT khanmdalmasrur detectionofthechronickidneydiseaseusingxgboostclassifierandexplainingtheinfluenceoftheattributesonthemodelusingshap AT keeseonghoon detectionofthechronickidneydiseaseusingxgboostclassifierandexplainingtheinfluenceoftheattributesonthemodelusingshap AT nahidabdullahal detectionofthechronickidneydiseaseusingxgboostclassifierandexplainingtheinfluenceoftheattributesonthemodelusingshap |
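
Note on the SHAP analysis mentioned in the description: the sketch below illustrates what a Shapley-value attribution is, using only the Python standard library. It is not the authors' model or pipeline. The scoring function `toy_risk`, its thresholds, the feature values, and the `baseline` record are all hypothetical and chosen only so the arithmetic is easy to follow. The exact enumeration shown here is feasible only for a handful of features; libraries such as `shap` exist to compute these attributions efficiently for tree ensembles like XGBoost.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance` relative to `baseline`.

    A feature that is absent from a coalition keeps its baseline value.
    Every ordering of the features is enumerated, so the cost grows
    factorially with the number of features.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)          # start with no feature "present"
        previous = predict(current)
        for name in order:
            current[name] = instance[name]    # add one feature to the coalition
            value = predict(current)
            totals[name] += value - previous  # marginal contribution
            previous = value
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_risk(x):
    """Hypothetical risk score (assumption): NOT the classifier from the paper."""
    score = 0.0
    low_hemoglobin = x["hemoglobin"] < 12.0   # illustrative threshold only
    high_albumin = x["albumin"] >= 2          # illustrative threshold only
    if low_hemoglobin:
        score += 0.5
    if high_albumin:
        score += 0.3
    if low_hemoglobin and high_albumin:
        score += 0.1                          # interaction term
    return score


# Hypothetical reference record and patient record (assumptions).
baseline = {"hemoglobin": 15.0, "albumin": 0, "age": 50}
patient = {"hemoglobin": 9.5, "albumin": 3, "age": 62}

phi = shapley_values(toy_risk, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:.2f}")
```

In this toy setting the attributions add up to the difference between the patient's score and the baseline score, and the 0.1 interaction term is split evenly between the two features that trigger it; `age` receives zero because the toy score never reads it.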