Study of cardiovascular disease prediction model based on random forest in eastern China

Cardiovascular disease (CVD) is the leading cause of death worldwide and a major public health concern. CVD prediction is one of the most effective measures for CVD control. In this study, 29,930 subjects at high risk of CVD were selected from 101,056 people in 2014, and regular follow-up was conducted...

Bibliographic Details
Main authors: Yang, Li; Wu, Haibin; Jin, Xiaoqing; Zheng, Pinpin; Hu, Shiyun; Xu, Xiaoling; Yu, Wei; Yan, Jing
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7090086/
https://www.ncbi.nlm.nih.gov/pubmed/32251324
http://dx.doi.org/10.1038/s41598-020-62133-5
_version_ 1783509858982559744
author Yang, Li
Wu, Haibin
Jin, Xiaoqing
Zheng, Pinpin
Hu, Shiyun
Xu, Xiaoling
Yu, Wei
Yan, Jing
collection PubMed
description Cardiovascular disease (CVD) is the leading cause of death worldwide and a major public health concern. CVD prediction is one of the most effective measures for CVD control. In this study, 29,930 subjects at high risk of CVD were selected from 101,056 people in 2014, and regular follow-up was conducted using an electronic health record system. Logistic regression analysis showed that nearly 30 indicators were related to CVD, including male sex, older age, family income, smoking, drinking, obesity, excessive waist circumference, abnormal cholesterol, abnormal low-density lipoprotein, and abnormal fasting blood glucose, among others. Several methods were used to build prediction models, including a multivariate regression model, classification and regression tree (CART), Naïve Bayes, bagged trees, AdaBoost, and Random Forest. We used the multivariate regression model as the benchmark for performance evaluation (area under the curve, AUC = 0.7143). The results showed that Random Forest outperformed the other methods with an AUC of 0.787, a significant improvement over the benchmark. We provide a CVD prediction model for 3-year risk assessment of CVD. It is based on a large population at high risk of CVD in eastern China and uses the Random Forest algorithm, and it may serve as a reference for CVD prediction and treatment in China.
format Online
Article
Text
id pubmed-7090086
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7090086 2020-03-27
Study of cardiovascular disease prediction model based on random forest in eastern China
Yang, Li; Wu, Haibin; Jin, Xiaoqing; Zheng, Pinpin; Hu, Shiyun; Xu, Xiaoling; Yu, Wei; Yan, Jing
Sci Rep
Article
Nature Publishing Group UK 2020-03-23
/pmc/articles/PMC7090086/ /pubmed/32251324 http://dx.doi.org/10.1038/s41598-020-62133-5
Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Study of cardiovascular disease prediction model based on random forest in eastern China
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7090086/
https://www.ncbi.nlm.nih.gov/pubmed/32251324
http://dx.doi.org/10.1038/s41598-020-62133-5