The Comprehensive Machine Learning Analytics for Heart Failure

Background: Early detection of heart failure is the basis for better medical treatment and prognosis. Over the last decades, both prevalence and incidence rates of heart failure have increased worldwide, resulting in a significant global public health issue. However, an early diagnosis is not an eas...

Bibliographic Details
Main authors: Guo, Chao-Yu, Wu, Min-Yang, Cheng, Hao-Min
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8124765/
https://www.ncbi.nlm.nih.gov/pubmed/34066464
http://dx.doi.org/10.3390/ijerph18094943
author Guo, Chao-Yu
Wu, Min-Yang
Cheng, Hao-Min
collection PubMed
description Background: Early detection of heart failure is the basis for better medical treatment and prognosis. Over recent decades, both the prevalence and the incidence of heart failure have increased worldwide, making it a significant global public health issue. However, early diagnosis is difficult because the symptoms of heart failure are usually non-specific. Therefore, this study aims to develop a machine learning-based risk prediction model for incident heart failure. Although African Americans have a higher risk of incident heart failure than other populations, few studies have developed a heart failure risk prediction model for African Americans. Methods: This research implemented Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression, support vector machine, random forest, and Extreme Gradient Boosting (XGBoost) to establish a predictive model for the Jackson Heart Study. In the analysis of real data, missing data are problematic when building a predictive model. Here, we evaluate the inclusion of predictors at various missing rates, together with different imputation strategies, to find the best-performing combination. Results: Among the hundreds of models that we examined, the best predictive model was the XGBoost model that included variables with a missing rate of less than 30 percent, with missing values imputed by non-parametric random forest imputation. This XGBoost model achieved an area under the curve (AUC) of 0.8409 for predicting heart failure in the Jackson Heart Study. Conclusion: This research identifies variations of diabetes medication as the most crucial risk factor for heart failure, a finding that the complete-case approach failed to detect.
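Two of the steps named in the abstract, screening predictors by missing rate and scoring a model by AUC, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy rows, and the column names (`sbp`, `hba1c`, `bmi`) are hypothetical, and the random forest imputation and XGBoost fitting steps are deliberately left out. Only the 30 percent threshold comes from the abstract.

```python
from typing import Dict, List, Optional, Sequence

# One record per participant; None marks a missing value (toy layout, assumed).
Row = Dict[str, Optional[float]]


def missing_rate(rows: Sequence[Row], column: str) -> float:
    """Fraction of rows in which `column` is absent or None."""
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def select_columns(rows: Sequence[Row], columns: Sequence[str],
                   max_missing: float = 0.30) -> List[str]:
    """Keep the columns whose missing rate is strictly below `max_missing`."""
    return [c for c in columns if missing_rate(rows, c) < max_missing]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
rows: List[Row] = [
    {"sbp": 120.0, "hba1c": None, "bmi": 27.1},
    {"sbp": 135.0, "hba1c": None, "bmi": 31.4},
    {"sbp": None, "hba1c": 6.1, "bmi": 24.9},
    {"sbp": 142.0, "hba1c": None, "bmi": 29.0},
    {"sbp": 118.0, "hba1c": 5.4, "bmi": 22.3},
]
kept = select_columns(rows, ["sbp", "hba1c", "bmi"])  # hba1c is 60% missing

toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

In a fuller pipeline, the retained columns would be imputed and passed to a classifier, and the AUC would be computed on held-out predictions rather than on toy scores.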
format Online
Article
Text
id pubmed-8124765
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8124765 2021-05-17 The Comprehensive Machine Learning Analytics for Heart Failure Guo, Chao-Yu Wu, Min-Yang Cheng, Hao-Min Int J Environ Res Public Health Article
MDPI 2021-05-06 /pmc/articles/PMC8124765/ /pubmed/34066464 http://dx.doi.org/10.3390/ijerph18094943 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title The Comprehensive Machine Learning Analytics for Heart Failure
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8124765/
https://www.ncbi.nlm.nih.gov/pubmed/34066464
http://dx.doi.org/10.3390/ijerph18094943