Score and Correlation Coefficient-Based Feature Selection for Predicting Heart Failure Diagnosis by Using Machine Learning Algorithms

Cardiovascular disease (CVD) is one of the most common causes of death, killing approximately 17 million people annually. The main reasons behind CVD are myocardial infarction and the failure of the heart to pump blood normally. Doctors can diagnose heart failure (HF) through electronic medical...

Bibliographic Details
Main authors: Senan, Ebrahim Mohammed; Abunadi, Ibrahim; Jadhav, Mukti E.; Fati, Suliman Mohamed
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8712170/
https://www.ncbi.nlm.nih.gov/pubmed/34966445
http://dx.doi.org/10.1155/2021/8500314
author Senan, Ebrahim Mohammed
Abunadi, Ibrahim
Jadhav, Mukti E.
Fati, Suliman Mohamed
collection PubMed
description Cardiovascular disease (CVD) is one of the most common causes of death, killing approximately 17 million people annually. The main reasons behind CVD are myocardial infarction and the failure of the heart to pump blood normally. Doctors can diagnose heart failure (HF) through electronic medical records on the basis of patients' symptoms and clinical laboratory investigations. However, accurate diagnosis of HF requires medical resources and expert practitioners that are not always available, which makes diagnosis challenging. Predicting patients' conditions with machine learning algorithms is therefore necessary to save time and effort. This paper proposes a machine-learning-based approach that identifies the most important correlated features in patients' electronic clinical records. The SelectKBest function was applied with the chi-squared statistical method to determine the most important features, and feature engineering was then applied to create new, strongly correlated features in order to train machine learning models and obtain promising results. Classification algorithms with optimised hyperparameters (SVM, KNN, Decision Tree, Random Forest, and Logistic Regression) were trained on two different datasets. The first dataset, called Cleveland, consisted of 303 records. The second dataset, which was used for predicting HF, consisted of 299 records. Experimental results showed that the Random Forest algorithm achieved accuracy, precision, recall, and F1 scores of 95%, 97.62%, 95.35%, and 96.47%, respectively, during the test phase for the second dataset. The same algorithm achieved accuracy scores of 100% for the first dataset and 97.68% for the second dataset, while 100% precision, recall, and F1 scores were reached for both datasets.
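The chi-squared feature scoring mentioned in the description can be illustrated with a minimal sketch. This is not the authors' code: it is a pure-Python stand-in for what scikit-learn's SelectKBest does when paired with its chi2 score function, written by hand so the arithmetic is visible. The helper names (chi2_scores, select_k_best) and the toy values are hypothetical and are not taken from either dataset.

```python
from collections import defaultdict


def chi2_scores(X, y):
    """Chi-squared score per feature for non-negative features X and labels y.

    For each feature, the observed value is the feature's sum within each
    class, and the expected value is the feature's overall sum multiplied by
    the class frequency.
    """
    n_samples = len(X)
    n_features = len(X[0])
    class_counts = defaultdict(int)
    observed = defaultdict(lambda: [0.0] * n_features)
    for row, label in zip(X, y):
        class_counts[label] += 1
        sums = observed[label]
        for j, value in enumerate(row):
            if value < 0:
                raise ValueError("chi-squared scoring needs non-negative features")
            sums[j] += value
    totals = [sum(observed[c][j] for c in class_counts) for j in range(n_features)]
    scores = []
    for j in range(n_features):
        score = 0.0
        for c, count in class_counts.items():
            expected = totals[j] * count / n_samples
            if expected > 0:
                score += (observed[c][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_k_best(X, y, k):
    """Return the indices of the k highest-scoring features and the reduced rows."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Toy records with hypothetical values: three features per patient,
# y is a binary outcome label.
X = [
    [1, 5, 0],
    [1, 4, 0],
    [1, 5, 9],
    [1, 6, 8],
]
y = [0, 0, 1, 1]

kept, X_reduced = select_k_best(X, y, k=1)
print(kept)
```

In this toy example the first feature is constant across classes and scores zero, while the third differs sharply between classes and is the one retained. A classifier such as Random Forest would then be trained on the reduced rows.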
format Online
Article
Text
id pubmed-8712170
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8712170 2021-12-28 Comput Math Methods Med Research Article Hindawi 2021-12-20 /pmc/articles/PMC8712170/ /pubmed/34966445 http://dx.doi.org/10.1155/2021/8500314 Text en Copyright © 2021 Ebrahim Mohammed Senan et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Score and Correlation Coefficient-Based Feature Selection for Predicting Heart Failure Diagnosis by Using Machine Learning Algorithms
topic Research Article