Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning
PURPOSE: This study sought to develop models with good identification for adverse outcomes in patients with heart failure (HF) and find strong factors that affect prognosis. PATIENTS AND METHODS: A total of 5004 qualifying cases were selected, among which 498 cases had adverse outcomes and 4506 case...
Main authors: | Wang, Ke; Tian, Jing; Zheng, Chu; Yang, Hong; Ren, Jia; Li, Chenhao; Han, Qinghua; Zhang, Yanbo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Dove 2021 |
Subjects: | Original Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8206455/ https://www.ncbi.nlm.nih.gov/pubmed/34149290 http://dx.doi.org/10.2147/RMHP.S310295 |
_version_ | 1783708628776124416 |
---|---|
author | Wang, Ke; Tian, Jing; Zheng, Chu; Yang, Hong; Ren, Jia; Li, Chenhao; Han, Qinghua; Zhang, Yanbo |
author_sort | Wang, Ke |
collection | PubMed |
description | PURPOSE: This study sought to develop models that accurately identify adverse outcomes in patients with heart failure (HF) and to find the factors that most strongly affect prognosis. PATIENTS AND METHODS: A total of 5004 qualifying cases were selected, of which 498 had adverse outcomes and 4506 were discharged after improvement. The study subjects were hospitalized patients diagnosed with HF at a regional cardiovascular hospital and the cardiology department of a medical university hospital in Shanxi Province, China, between January 2014 and June 2019. The synthetic minority oversampling technique combined with edited nearest neighbors (SMOTE+ENN) was used to pre-process the imbalanced data. Traditional logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) were used to build risk identification models, and each model was run 100 times. Model discrimination and calibration were assessed using the F1-score, the area under the receiver-operating characteristic curve (AUROC), and the Brier score. The best performing of the five models was used to identify the risk of adverse outcomes and evaluate the influencing factors. RESULTS: The SME-XGBoost model (XGBoost with SMOTE+ENN pre-processing) performed best, with a mean F1-score of 0.3673 (95% confidence interval [CI]: 0.3633–0.3712), a mean AUROC of 0.8010 (CI: 0.7974–0.8046), and a mean Brier score of 0.1769 (CI: 0.1748–0.1789). Age, N-terminal pronatriuretic peptide, and pulmonary disease were among the most significant factors for adverse outcomes in patients with HF. CONCLUSION: The combination of SMOTE+ENN and advanced machine learning methods effectively improved the discrimination of adverse outcomes in HF patients, accurately stratified patients at risk of adverse outcomes, and identified the top factors for adverse outcomes. These models and factors emphasize the importance of health status data in determining adverse outcomes in patients with HF. |
format | Online Article Text |
id | pubmed-8206455 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Dove |
record_format | MEDLINE/PubMed |
spelling | pubmed-8206455 2021-06-17 Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning Wang, Ke; Tian, Jing; Zheng, Chu; Yang, Hong; Ren, Jia; Li, Chenhao; Han, Qinghua; Zhang, Yanbo Risk Manag Healthc Policy Original Research Dove 2021-06-08 /pmc/articles/PMC8206455/ /pubmed/34149290 http://dx.doi.org/10.2147/RMHP.S310295 Text en © 2021 Wang et al. https://creativecommons.org/licenses/by-nc/3.0/ This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (https://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php). |
title | Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning |
topic | Original Research |
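
The abstract reports three evaluation metrics (F1-score, AUROC, and Brier score) on a data set in which 498 of 5004 cases had adverse outcomes. The following is a minimal sketch of how these metrics are commonly defined, written in plain Python. It is not the authors' code: the function names, the 0.5 decision threshold, and the ten toy labels and probabilities are illustrative assumptions, and only the case counts come from the record above.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1): 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Share of adverse outcomes, from the counts given in the abstract.
prevalence = 498 / 5004

# Hypothetical toy data (NOT from the study): 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.1, 0.6, 0.2, 0.4, 0.1, 0.7, 0.4]

# Assumed decision threshold of 0.5 for turning probabilities into labels.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

toy_f1 = f1_score(y_true, y_pred)
toy_auroc = auroc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)

print(f"prevalence={prevalence:.4f}")
print(f"F1={toy_f1:.4f} AUROC={toy_auroc:.4f} Brier={toy_brier:.4f}")
```

With roughly one adverse outcome in ten cases, F1 depends on the chosen threshold and on the positive class only, whereas AUROC is threshold-free. That is one reason a model can show a modest F1 alongside a comparatively high AUROC, as in the reported results.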