
Predicting Chronic Kidney Disease Using Hybrid Machine Learning Based on Apache Spark

Chronic kidney disease (CKD) has become a widespread disease among people. It is related to various serious risks like cardiovascular disease, heightened risk, and end-stage renal disease, which can be feasibly avoidable by early detection and treatment of people in danger of this disease. The machi...


Bibliographic Details
Main authors:	Abdel-Fattah, Manal A; Othman, Nermin Abdelhakim; Goher, Nagwa
Format:	Online Article Text
Language:	English
Published:	Hindawi 2022
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8890824/ https://www.ncbi.nlm.nih.gov/pubmed/35251161 http://dx.doi.org/10.1155/2022/9898831

author	Abdel-Fattah, Manal A Othman, Nermin Abdelhakim Goher, Nagwa
collection	PubMed
description	Chronic kidney disease (CKD) has become a widespread disease among people. It is related to various serious risks like cardiovascular disease, heightened risk, and end-stage renal disease, which can be feasibly avoidable by early detection and treatment of people in danger of this disease. The machine learning algorithm is a source of significant assistance for medical scientists to diagnose the disease accurately in its outset stage. Recently, Big Data platforms are integrated with machine learning algorithms to add value to healthcare. Therefore, this paper proposes hybrid machine learning techniques that include feature selection methods and machine learning classification algorithms based on big data platforms (Apache Spark) that were used to detect chronic kidney disease (CKD). The feature selection techniques, namely, Relief-F and chi-squared feature selection method, were applied to select the important features. Six machine learning classification algorithms were used in this research: decision tree (DT), logistic regression (LR), Naive Bayes (NB), Random Forest (RF), support vector machine (SVM), and Gradient-Boosted Trees (GBT Classifier) as ensemble learning algorithms. Four methods of evaluation, namely, accuracy, precision, recall, and F1-measure, were applied to validate the results. For each algorithm, the results of cross-validation and the testing results have been computed based on full features, the features selected by Relief-F, and the features selected by chi-squared feature selection method. The results showed that SVM, DT, and GBT Classifiers with the selected features had achieved the best performance at 100% accuracy. Overall, Relief-F's selected features are better than full features and the features selected by chi-square.
format	Online Article Text
id	pubmed-8890824
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Hindawi
record_format	MEDLINE/PubMed
spelling	pubmed-8890824 2022-03-03 Comput Intell Neurosci Research Article Hindawi 2022-02-23 /pmc/articles/PMC8890824/ /pubmed/35251161 http://dx.doi.org/10.1155/2022/9898831 Text en Copyright © 2022 Manal A Abdel-Fattah et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Predicting Chronic Kidney Disease Using Hybrid Machine Learning Based on Apache Spark
topic	Research Article
