Development of Machine Learning Tools for Predicting Coronary Artery Disease in the Chinese Population
PURPOSE: Coronary artery disease (CAD) is one of the major cardiovascular diseases and the leading cause of death globally. The blood lipid profile is associated with early CAD risk. Therefore, we aim to establish machine learning models that use the blood lipid profile to predict CAD risk. METHODS: In thi...
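Main Authors: | Zhang, Tiexu; Huang, Shengming; Xie, Pengfei; Li, Xiaoming; Pan, Yingxia; Xu, Yue; Han, Peng; Ding, Feifei; Zhao, Jiangman; Tang, Hui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |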
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9691305/ https://www.ncbi.nlm.nih.gov/pubmed/36438901 http://dx.doi.org/10.1155/2022/6030254 |
_version_ | 1784837010680184832 |
---|---|
author | Zhang, Tiexu Huang, Shengming Xie, Pengfei Li, Xiaoming Pan, Yingxia Xu, Yue Han, Peng Ding, Feifei Zhao, Jiangman Tang, Hui |
author_facet | Zhang, Tiexu Huang, Shengming Xie, Pengfei Li, Xiaoming Pan, Yingxia Xu, Yue Han, Peng Ding, Feifei Zhao, Jiangman Tang, Hui |
author_sort | Zhang, Tiexu |
collection | PubMed |
description | PURPOSE: Coronary artery disease (CAD) is one of the major cardiovascular diseases and the leading cause of death globally. The blood lipid profile is associated with early CAD risk. Therefore, we aim to establish machine learning models that use the blood lipid profile to predict CAD risk. METHODS: In this study, 193 non-CAD controls and 2001 newly diagnosed CAD patients (1647 CAD patients who received lipid-lowering therapy and 354 who did not) were recruited. Clinical data and the results of routine blood lipid tests were collected. Moreover, low-density lipoprotein cholesterol (LDL-C) subfractions (LDLC-1 to LDLC-7) were classified and quantified using the Lipoprint system. Six predictive models (k-nearest neighbor classifier (KNN), logistic regression (LR), support vector machine (SVM), decision tree (DT), multilayer perceptron (MLP), and extreme gradient boosting (XGBoost)) were established and evaluated by the confusion matrix, area under the receiver operating characteristic (ROC) curve (AUC), recall (sensitivity), accuracy, precision, and F1 score. The selected features were analyzed and ranked. RESULTS: When predicting the CAD development risk of the CAD patients without lipid-lowering therapy in the test set, all models obtained AUC values above 0.94, and the accuracy, precision, recall, and F1 score were above 0.84, 0.85, 0.92, and 0.88, respectively. When predicting the CAD development risk of all CAD patients in the test set, all models obtained AUC values above 0.91, and the accuracy, precision, recall, and F1 score were above 0.87, 0.94, 0.87, and 0.92, respectively. Importantly, small dense LDL-C (sdLDL-C) and LDLC-4 play pivotal roles in predicting CAD risk. CONCLUSIONS: In the present study, machine learning tools combining both clinical data and blood lipid profile showed excellent overall predictive power. This suggests that machine learning tools are suitable for predicting the risk of CAD development in the near future. |
format | Online Article Text |
id | pubmed-9691305 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-9691305 2022-11-25 Development of Machine Learning Tools for Predicting Coronary Artery Disease in the Chinese Population Zhang, Tiexu Huang, Shengming Xie, Pengfei Li, Xiaoming Pan, Yingxia Xu, Yue Han, Peng Ding, Feifei Zhao, Jiangman Tang, Hui Dis Markers Research Article PURPOSE: Coronary artery disease (CAD) is one of the major cardiovascular diseases and the leading cause of death globally. The blood lipid profile is associated with early CAD risk. Therefore, we aim to establish machine learning models that use the blood lipid profile to predict CAD risk. METHODS: In this study, 193 non-CAD controls and 2001 newly diagnosed CAD patients (1647 CAD patients who received lipid-lowering therapy and 354 who did not) were recruited. Clinical data and the results of routine blood lipid tests were collected. Moreover, low-density lipoprotein cholesterol (LDL-C) subfractions (LDLC-1 to LDLC-7) were classified and quantified using the Lipoprint system. Six predictive models (k-nearest neighbor classifier (KNN), logistic regression (LR), support vector machine (SVM), decision tree (DT), multilayer perceptron (MLP), and extreme gradient boosting (XGBoost)) were established and evaluated by the confusion matrix, area under the receiver operating characteristic (ROC) curve (AUC), recall (sensitivity), accuracy, precision, and F1 score. The selected features were analyzed and ranked. RESULTS: When predicting the CAD development risk of the CAD patients without lipid-lowering therapy in the test set, all models obtained AUC values above 0.94, and the accuracy, precision, recall, and F1 score were above 0.84, 0.85, 0.92, and 0.88, respectively. When predicting the CAD development risk of all CAD patients in the test set, all models obtained AUC values above 0.91, and the accuracy, precision, recall, and F1 score were above 0.87, 0.94, 0.87, and 0.92, respectively. Importantly, small dense LDL-C (sdLDL-C) and LDLC-4 play pivotal roles in predicting CAD risk. CONCLUSIONS: In the present study, machine learning tools combining both clinical data and blood lipid profile showed excellent overall predictive power. This suggests that machine learning tools are suitable for predicting the risk of CAD development in the near future. Hindawi 2022-11-17 /pmc/articles/PMC9691305/ /pubmed/36438901 http://dx.doi.org/10.1155/2022/6030254 Text en Copyright © 2022 Tiexu Zhang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Zhang, Tiexu Huang, Shengming Xie, Pengfei Li, Xiaoming Pan, Yingxia Xu, Yue Han, Peng Ding, Feifei Zhao, Jiangman Tang, Hui Development of Machine Learning Tools for Predicting Coronary Artery Disease in the Chinese Population |
title | Development of Machine Learning Tools for Predicting Coronary Artery Disease in the Chinese Population |
title_full | Development of Machine Learning Tools for Predicting Coronary Artery Disease in the Chinese Population |
title_fullStr | Development of Machine Learning Tools for Predicting Coronary Artery Disease in the Chinese Population |
title_full_unstemmed | Development of Machine Learning Tools for Predicting Coronary Artery Disease in the Chinese Population |
title_short | Development of Machine Learning Tools for Predicting Coronary Artery Disease in the Chinese Population |
title_sort | development of machine learning tools for predicting coronary artery disease in the chinese population |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9691305/ https://www.ncbi.nlm.nih.gov/pubmed/36438901 http://dx.doi.org/10.1155/2022/6030254 |
work_keys_str_mv | AT zhangtiexu developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation AT huangshengming developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation AT xiepengfei developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation AT lixiaoming developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation AT panyingxia developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation AT xuyue developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation AT hanpeng developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation AT dingfeifei developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation AT zhaojiangman developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation AT tanghui developmentofmachinelearningtoolsforpredictingcoronaryarterydiseaseinthechinesepopulation |
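
The abstract in this record reports accuracy, precision, recall (sensitivity), and F1 score derived from a confusion matrix. As a reading aid, the following minimal sketch shows how those four metrics follow from the four confusion-matrix cells. It is not the authors' code: the class name `ConfusionMatrix`, the helper `confusion_from_labels`, the label coding (1 = CAD, 0 = non-CAD control), and the example counts are all assumptions made for illustration and do not come from the article.

```python
from dataclasses import dataclass
from typing import Iterable


def _safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix (hypothetical helper, not from the article)."""
    tp: int  # CAD patients predicted as CAD
    fp: int  # controls predicted as CAD
    fn: int  # CAD patients predicted as control
    tn: int  # controls predicted as control

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return _safe_div(self.tp + self.tn, total)

    def precision(self) -> float:
        return _safe_div(self.tp, self.tp + self.fp)

    def recall(self) -> float:
        # Recall is the same quantity as sensitivity.
        return _safe_div(self.tp, self.tp + self.fn)

    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision(), self.recall()
        return _safe_div(2 * p * r, p + r)


def confusion_from_labels(y_true: Iterable[int], y_pred: Iterable[int]) -> ConfusionMatrix:
    """Count the four cells, assuming 1 = CAD and 0 = non-CAD control."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return ConfusionMatrix(tp=tp, fp=fp, fn=fn, tn=tn)


# Made-up counts for illustration only; they are not results from the study.
example = ConfusionMatrix(tp=90, fp=5, fn=10, tn=15)
print(f"accuracy={example.accuracy():.3f} precision={example.precision():.3f} "
      f"recall={example.recall():.3f} f1={example.f1():.3f}")
```

Because the cohort described in the abstract has far more CAD patients (2001) than controls (193), accuracy alone can look high even for a weak classifier, which is one reason to read precision, recall, and F1 alongside it.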