The Prediction Models for High-Risk Population of Stroke Based on Logistic Regressive Analysis and Lightgbm Algorithm Separately

BACKGROUND: We aimed to investigate the high-risk factors of stroke through logistic regressive analysis and using LightGBM algorithm separately. The results of the two models were compared for instructing the prevention of stroke. METHODS: Samples of residents older than 40 years of age were collec...

Bibliographic Details
Main authors: Xue, Yicheng, Chen, Silong, Zhang, Mengmeng, Cai, Xiaojuan, Zheng, Jialian, Wang, Shihua, Chen, Yan
Format: Online Article Text
Language: English
Published: Tehran University of Medical Sciences 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643232/
https://www.ncbi.nlm.nih.gov/pubmed/36407747
http://dx.doi.org/10.18502/ijph.v51i5.9415
author Xue, Yicheng
Chen, Silong
Zhang, Mengmeng
Cai, Xiaojuan
Zheng, Jialian
Wang, Shihua
Chen, Yan
author_sort Xue, Yicheng
collection PubMed
description BACKGROUND: We aimed to investigate high-risk factors for stroke using logistic regression analysis and the LightGBM algorithm separately, and compared the results of the two models to guide stroke prevention. METHODS: Residents older than 40 years of age were sampled from two medical examination centers in Jiaxing, China, from 2018 to 2019. Of the 2124 subjects in total, 1059 were middle-aged (40–59 years old) and 1065 were elderly (≥60 years old). Their demographic characteristics, medical history, family history, eating habits, etc. were recorded and input separately into logistic regression analysis and the LightGBM algorithm to build prediction models for the population at high risk of stroke. Four metrics (F1 score, accuracy, recall, and AUROC) were compared between the two models. RESULTS: Stroke risk was positively correlated with age and negatively correlated with the frequency of fruit consumption and with taste preference. A low-salt diet was associated with a lower risk of stroke than a high-salt diet, and men had a higher stroke risk than women. In addition, stroke risk was positively correlated with the frequency of alcohol consumption in the middle-aged group and negatively correlated with education level in the elderly group. All four metrics were higher for LightGBM than for logistic regression, except for recall in the middle-aged group. CONCLUSION: Age, gender, family history of hypertension and diabetes, the frequency of consumption of fruit, alcohol, and dairy products, taste preference, and education level could serve as predictive risk factors for stroke. The model using the LightGBM algorithm is more accurate than the one using logistic regression analysis.
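The abstract compares two classifiers on four metrics (F1 score, accuracy, recall, AUROC). As a rough illustration of what those metrics measure, the sketch below computes them in plain Python from true labels and predicted risk scores. The function names, the 0.5 decision threshold, and the toy data are assumptions made for this example; they are not taken from the article, and the two score lists do not represent the authors' logistic regression or LightGBM models.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 means high risk."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute the four metrics; the threshold is an assumption for this sketch."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"f1": f1, "accuracy": accuracy, "recall": recall, "auroc": auroc(y_true, scores)}


if __name__ == "__main__":
    # Made-up labels and scores, only to show how two models would be compared.
    labels = [1, 0, 1, 1, 0, 0, 1, 0]
    model_a_scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
    model_b_scores = [0.7, 0.6, 0.4, 0.8, 0.3, 0.2, 0.9, 0.5]
    for name, scores in [("model A", model_a_scores), ("model B", model_b_scores)]:
        print(name, evaluate(labels, scores))
```

Computing AUROC from pairwise rankings keeps the example dependency-free; in practice a library routine would normally be used, and the metrics would be computed on held-out data rather than on the training sample.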
format Online
Article
Text
id pubmed-9643232
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Tehran University of Medical Sciences
record_format MEDLINE/PubMed
spelling pubmed-9643232 2022-11-18 Iran J Public Health Original Article Tehran University of Medical Sciences 2022-05 /pmc/articles/PMC9643232/ /pubmed/36407747 http://dx.doi.org/10.18502/ijph.v51i5.9415 Text en Copyright © 2022 Xue et al. Published by Tehran University of Medical Sciences. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license (https://creativecommons.org/licenses/by-nc/4.0/). Non-commercial uses of the work are permitted, provided the original work is properly cited.
title The Prediction Models for High-Risk Population of Stroke Based on Logistic Regressive Analysis and Lightgbm Algorithm Separately
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643232/
https://www.ncbi.nlm.nih.gov/pubmed/36407747
http://dx.doi.org/10.18502/ijph.v51i5.9415