The Prediction Models for High-Risk Population of Stroke Based on Logistic Regressive Analysis and Lightgbm Algorithm Separately
BACKGROUND: We aimed to identify high-risk factors for stroke using logistic regression analysis and the LightGBM algorithm separately. The results of the two models were compared to guide stroke prevention. METHODS: Samples of residents older than 40 years of age were collec...
Main Authors: | Xue, Yicheng; Chen, Silong; Zhang, Mengmeng; Cai, Xiaojuan; Zheng, Jialian; Wang, Shihua; Chen, Yan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Tehran University of Medical Sciences 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643232/ https://www.ncbi.nlm.nih.gov/pubmed/36407747 http://dx.doi.org/10.18502/ijph.v51i5.9415 |
_version_ | 1784826476180275200 |
---|---|
author | Xue, Yicheng; Chen, Silong; Zhang, Mengmeng; Cai, Xiaojuan; Zheng, Jialian; Wang, Shihua; Chen, Yan |
author_facet | Xue, Yicheng; Chen, Silong; Zhang, Mengmeng; Cai, Xiaojuan; Zheng, Jialian; Wang, Shihua; Chen, Yan |
author_sort | Xue, Yicheng |
collection | PubMed |
description | BACKGROUND: We aimed to identify high-risk factors for stroke using logistic regression analysis and the LightGBM algorithm separately. The results of the two models were compared to guide stroke prevention. METHODS: Samples of residents older than 40 years of age were collected from two medical examination centers in Jiaxing, China, from 2018 to 2019. Of the 2124 subjects, 1059 were middle-aged (40–59 years old) and 1065 were elderly (≥60 years old). Their demographic characteristics, medical history, family history, eating habits, etc. were recorded and input separately into logistic regression analysis and the LightGBM algorithm to build prediction models for the population at high risk of stroke. Four metrics (F1 score, accuracy, recall, and AUROC) were compared between the two models. RESULTS: Stroke risk was positively correlated with age and negatively correlated with the frequency of fruit consumption and with taste preference. A low-salt diet was associated with a lower risk of stroke than a high-salt diet, and men had a higher stroke risk than women. In addition, stroke risk was positively correlated with the frequency of alcohol consumption in the middle-aged group and negatively correlated with education level in the elderly group. All four metrics were higher for LightGBM than for logistic regression, except for recall in the middle-aged group. CONCLUSION: Age, gender, family history of hypertension and diabetes, the frequency of fruit, alcohol, and dairy product consumption, taste preference, and education level could serve as predictive risk factors for stroke. The LightGBM model was more accurate than the logistic regression model. |
format | Online Article Text |
id | pubmed-9643232 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Tehran University of Medical Sciences |
record_format | MEDLINE/PubMed |
spelling | pubmed-9643232 2022-11-18 Iran J Public Health Original Article Tehran University of Medical Sciences 2022-05 /pmc/articles/PMC9643232/ /pubmed/36407747 http://dx.doi.org/10.18502/ijph.v51i5.9415 Text en Copyright © 2022 Xue et al. Published by Tehran University of Medical Sciences https://creativecommons.org/licenses/by-nc/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license (https://creativecommons.org/licenses/by-nc/4.0/). Non-commercial uses of the work are permitted, provided the original work is properly cited. |
spellingShingle | Original Article Xue, Yicheng Chen, Silong Zhang, Mengmeng Cai, Xiaojuan Zheng, Jialian Wang, Shihua Chen, Yan The Prediction Models for High-Risk Population of Stroke Based on Logistic Regressive Analysis and Lightgbm Algorithm Separately |
title | The Prediction Models for High-Risk Population of Stroke Based on Logistic Regressive Analysis and Lightgbm Algorithm Separately |
title_sort | prediction models for high-risk population of stroke based on logistic regressive analysis and lightgbm algorithm separately |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643232/ https://www.ncbi.nlm.nih.gov/pubmed/36407747 http://dx.doi.org/10.18502/ijph.v51i5.9415 |
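The abstract compares the two models on four metrics: F1 score, accuracy, recall, and AUROC. As a rough illustration of how those metrics are computed for any binary classifier, here is a minimal pure-Python sketch. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example; none of them come from the study, and this is not the authors' code.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all predictions that match the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true positives that the model detected."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 1 = high risk, 0 = not high risk.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("recall:  ", recall(y_true, y_pred))
print("F1:      ", f1_score(y_true, y_pred))
print("AUROC:   ", auroc(y_true, scores))
```

Accuracy, recall, and F1 depend on the chosen threshold, whereas AUROC is computed from the raw scores. That difference is one way a model can lead on three of the metrics and still trail on recall.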