Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning

PURPOSE: This study sought to develop models with good identification for adverse outcomes in patients with heart failure (HF) and find strong factors that affect prognosis. PATIENTS AND METHODS: A total of 5004 qualifying cases were selected, among which 498 cases had adverse outcomes and 4506 case...

Bibliographic Details
Main Authors: Wang, Ke; Tian, Jing; Zheng, Chu; Yang, Hong; Ren, Jia; Li, Chenhao; Han, Qinghua; Zhang, Yanbo
Format: Online Article Text
Language: English
Published: Dove 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8206455/
https://www.ncbi.nlm.nih.gov/pubmed/34149290
http://dx.doi.org/10.2147/RMHP.S310295
_version_ 1783708628776124416
author Wang, Ke
Tian, Jing
Zheng, Chu
Yang, Hong
Ren, Jia
Li, Chenhao
Han, Qinghua
Zhang, Yanbo
author_facet Wang, Ke
Tian, Jing
Zheng, Chu
Yang, Hong
Ren, Jia
Li, Chenhao
Han, Qinghua
Zhang, Yanbo
author_sort Wang, Ke
collection PubMed
description PURPOSE: This study sought to develop models with good identification for adverse outcomes in patients with heart failure (HF) and find strong factors that affect prognosis. PATIENTS AND METHODS: A total of 5004 qualifying cases were selected, among which 498 cases had adverse outcomes and 4506 cases were discharged after improvement. The study subjects were hospitalized patients diagnosed with HF from a regional cardiovascular hospital and the cardiology department of a medical university hospital in Shanxi Province of China between January 2014 and June 2019. Synthesizing minority oversampling technology combined with edited nearest neighbors (SMOTE+ENN) was used to pre-process unbalanced data. Traditional logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) were used to build risk identification models, and each model was repeated 100 times. Model discrimination and calibration were estimated using F1-score, the area under the receiver-operating characteristic curve (AUROC), and Brier score. The best performing of the five models was used to identify the risk of adverse outcomes and evaluate the influencing factors. RESULTS: The SME-XGBoost was the best performing model with means of F1-score (0.3673, 95% confidence interval [CI]: 0.3633–0.3712), AUC (0.8010, CI: 0.7974–0.8046), and Brier score (0.1769, CI: 0.1748–0.1789). Age, N-terminal pronatriuretic peptide, pulmonary disease, etc. were the most significant factors of adverse outcomes in patients with HF. CONCLUSION: The combination of SMOTE+ENN and advanced machine learning methods effectively improved the discrimination efficacy of adverse outcomes in HF patients, accurately stratified patients at risk of adverse outcomes, and found the top factors of adverse outcomes. These models and factors emphasize the importance of health status data in determining adverse outcomes in patients with HF.
format Online
Article
Text
id pubmed-8206455
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Dove
record_format MEDLINE/PubMed
spelling pubmed-82064552021-06-17 Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning Wang, Ke Tian, Jing Zheng, Chu Yang, Hong Ren, Jia Li, Chenhao Han, Qinghua Zhang, Yanbo Risk Manag Healthc Policy Original Research PURPOSE: This study sought to develop models with good identification for adverse outcomes in patients with heart failure (HF) and find strong factors that affect prognosis. PATIENTS AND METHODS: A total of 5004 qualifying cases were selected, among which 498 cases had adverse outcomes and 4506 cases were discharged after improvement. The study subjects were hospitalized patients diagnosed with HF from a regional cardiovascular hospital and the cardiology department of a medical university hospital in Shanxi Province of China between January 2014 and June 2019. Synthesizing minority oversampling technology combined with edited nearest neighbors (SMOTE+ENN) was used to pre-process unbalanced data. Traditional logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) were used to build risk identification models, and each model was repeated 100 times. Model discrimination and calibration were estimated using F1-score, the area under the receiver-operating characteristic curve (AUROC), and Brier score. The best performing of the five models was used to identify the risk of adverse outcomes and evaluate the influencing factors. RESULTS: The SME-XGBoost was the best performing model with means of F1-score (0.3673, 95% confidence interval [CI]: 0.3633–0.3712), AUC (0.8010, CI: 0.7974–0.8046), and Brier score (0.1769, CI: 0.1748–0.1789). Age, N-terminal pronatriuretic peptide, pulmonary disease, etc. were the most significant factors of adverse outcomes in patients with HF. 
CONCLUSION: The combination of SMOTE+ENN and advanced machine learning methods effectively improved the discrimination efficacy of adverse outcomes in HF patients, accurately stratified patients at risk of adverse outcomes, and found the top factors of adverse outcomes. These models and factors emphasize the importance of health status data in determining adverse outcomes in patients with HF. Dove 2021-06-08 /pmc/articles/PMC8206455/ /pubmed/34149290 http://dx.doi.org/10.2147/RMHP.S310295 Text en © 2021 Wang et al. https://creativecommons.org/licenses/by-nc/3.0/ This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (https://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php).
spellingShingle Original Research
Wang, Ke
Tian, Jing
Zheng, Chu
Yang, Hong
Ren, Jia
Li, Chenhao
Han, Qinghua
Zhang, Yanbo
Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning
title Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning
title_full Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning
title_fullStr Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning
title_full_unstemmed Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning
title_short Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning
title_sort improving risk identification of adverse outcomes in chronic heart failure using smote+enn and machine learning
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8206455/
https://www.ncbi.nlm.nih.gov/pubmed/34149290
http://dx.doi.org/10.2147/RMHP.S310295
work_keys_str_mv AT wangke improvingriskidentificationofadverseoutcomesinchronicheartfailureusingsmoteennandmachinelearning
AT tianjing improvingriskidentificationofadverseoutcomesinchronicheartfailureusingsmoteennandmachinelearning
AT zhengchu improvingriskidentificationofadverseoutcomesinchronicheartfailureusingsmoteennandmachinelearning
AT yanghong improvingriskidentificationofadverseoutcomesinchronicheartfailureusingsmoteennandmachinelearning
AT renjia improvingriskidentificationofadverseoutcomesinchronicheartfailureusingsmoteennandmachinelearning
AT lichenhao improvingriskidentificationofadverseoutcomesinchronicheartfailureusingsmoteennandmachinelearning
AT hanqinghua improvingriskidentificationofadverseoutcomesinchronicheartfailureusingsmoteennandmachinelearning
AT zhangyanbo improvingriskidentificationofadverseoutcomesinchronicheartfailureusingsmoteennandmachinelearning
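
Note on the evaluation measures: the description field of this record names the F1-score and the Brier score as evaluation measures and gives the class counts (498 adverse outcomes among 5004 cases). The snippet below is a minimal sketch of how these two measures are conventionally computed, written in plain Python. It is not the authors' code. The toy labels, the predicted probabilities, the 0.5 decision threshold, and the function names are hypothetical and are not taken from the study.

```python
# Minimal sketch (assumption: binary labels, 1 = adverse outcome, 0 = improved).

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no positives and no positive predictions.
    return 2 * tp / denom if denom else 0.0


# Class counts stated in the record: 498 adverse outcomes, 4506 improved.
minority_share = 498 / (498 + 4506)  # roughly 10% of cases

# Hypothetical toy data, for illustration only.
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
y_prob = [0.8, 0.3, 0.1, 0.6, 0.4, 0.2, 0.1, 0.05, 0.3, 0.2]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # 0.5 threshold is an assumption

f1 = f1_score(y_true, y_pred)
brier = brier_score(y_true, y_prob)
print(f"minority share: {minority_share:.4f}")
print(f"F1: {f1:.4f}  Brier: {brier:.5f}")
```

With roughly one positive case in ten, a model can have a low Brier score and a high AUROC while its F1-score stays modest. This is one reason the record reports the three measures together.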