
Combinatorial Use of Machine Learning and Logistic Regression for Predicting Carotid Plaque Risk Among 5.4 Million Adults With Fatty Liver Disease Receiving Health Check-Ups: Population-Based Cross-Sectional Study

BACKGROUND: Carotid plaque can progress into stroke, myocardial infarction, etc, which are major global causes of death. Evidence shows a significant increase in carotid plaque incidence among patients with fatty liver disease. However, unlike the high detection rate of fatty liver disease, screenin...


Bibliographic Details
Main authors: Deng, Yuhan; Ma, Yuan; Fu, Jingzhu; Wang, Xiaona; Yu, Canqing; Lv, Jun; Man, Sailimai; Wang, Bo; Li, Liming
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10514774/
https://www.ncbi.nlm.nih.gov/pubmed/37676713
http://dx.doi.org/10.2196/47095
author Deng, Yuhan
Ma, Yuan
Fu, Jingzhu
Wang, Xiaona
Yu, Canqing
Lv, Jun
Man, Sailimai
Wang, Bo
Li, Liming
collection PubMed
description BACKGROUND: Carotid plaque can progress to stroke, myocardial infarction, and other conditions that are major global causes of death. Evidence shows a significant increase in carotid plaque incidence among patients with fatty liver disease. However, unlike fatty liver disease, which has a high detection rate, carotid plaque is not yet widely screened for in the asymptomatic population because of cost-effectiveness concerns. As a result, many patients have undetected carotid plaques, especially among those with fatty liver disease.
OBJECTIVE: This study aimed to combine the advantages of machine learning (ML) and logistic regression to develop a straightforward prediction model that identifies individuals at risk of carotid plaque among the population with fatty liver disease.
METHODS: Our study included 5,420,640 participants with fatty liver from Meinian Health Care Center. We used random forest, elastic net (EN), and extreme gradient boosting ML algorithms to select important features from potential predictors. Features selected by all 3 models were entered into a logistic regression analysis to develop a carotid plaque prediction model. Model performance was evaluated using the area under the receiver operating characteristic curve, calibration curve, Brier score, and decision curve analysis, both in a randomly split internal validation data set and in an external validation data set comprising 32,682 participants from MJ Health Check-up Center. Risk cutoff points for carotid plaque were determined based on the Youden index, predicted probability distribution, and prevalence rate of the internal validation data set to classify participants into high-, intermediate-, and low-risk groups. This risk classification was further validated in the external validation data set.
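The consensus step described in METHODS (keeping only the features that all 3 ML models select) can be illustrated with a minimal sketch. This is not the authors' code: the function name `consensus_features`, the variable names, and the toy feature lists are hypothetical and are not taken from the study.

```python
def consensus_features(*selections):
    """Return the features present in every selection.

    Each argument is a list of feature names chosen by one selector.
    The result keeps the order of the first selection.
    """
    if not selections:
        return []
    # Intersect the first selection with all remaining ones.
    common = set(selections[0]).intersection(*map(set, selections[1:]))
    return [name for name in selections[0] if name in common]


# Hypothetical top-ranked features from three selectors (toy data only).
rf_top = ["age", "sbp", "ldl_c", "bmi"]
en_top = ["age", "ldl_c", "sbp", "smoking"]
xgb_top = ["sbp", "age", "ldl_c", "alt"]

selected = consensus_features(rf_top, en_top, xgb_top)
print(selected)  # ['age', 'sbp', 'ldl_c']
```

Only the features returned by the intersection would then be passed to the logistic regression step; how each selector ranks and truncates its own feature list is left open here.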
RESULTS: Among the participants, 26.23% (1,421,970/5,420,640) were diagnosed with carotid plaque in the development data set, and 21.64% (7074/32,682) were diagnosed in the external validation data set. Out of 27 predictors, 6 features were selected by all 3 ML models: age, systolic blood pressure, low-density lipoprotein cholesterol (LDL-C), total cholesterol, fasting blood glucose, and hepatic steatosis index (HSI). After collinear features were removed, the logistic regression model built with the 5 remaining independent predictors reached an area under the curve of 0.831 in the internal validation data set and 0.801 in the external validation data set, and showed good calibration on graphical assessment. Its predictive performance was competitive with that of logistic regression or ML algorithms used alone. Optimal predicted probability cutoff points of 25% and 65% were determined for classifying individuals into low-, intermediate-, and high-risk categories for carotid plaque.
CONCLUSIONS: The combination of ML and logistic regression yielded a practical carotid plaque prediction model, which has important public health implications for the early identification and risk assessment of carotid plaque among individuals with fatty liver.
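As a rough illustration of the risk stratification reported above, the sketch below computes a Youden-index cutoff on toy data and then assigns the three risk groups using the 25% and 65% cutoffs. It is a simplified sketch under stated assumptions: the study combined the Youden index with the predicted probability distribution and the prevalence rate, which is not reproduced here; the function names, the toy labels and probabilities, and the choice to treat a probability exactly at a cutoff as belonging to the higher group are all assumptions.

```python
def youden_cutoff(y_true, probs):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is predicted positive when its probability is >= threshold.
    Ties are resolved in favor of the lowest threshold.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_t, best_j = None, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for y, p in zip(y_true, probs) if y == 1 and p >= t)
        tn = sum(1 for y, p in zip(y_true, probs) if y == 0 and p < t)
        j = tp / positives + tn / negatives - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def risk_group(p, low=0.25, high=0.65):
    """Map a predicted probability to a risk group.

    Boundary handling (p == low or p == high goes to the higher group)
    is an assumption; the abstract does not specify it.
    """
    if p < low:
        return "low"
    if p < high:
        return "intermediate"
    return "high"


# Toy labels and predicted probabilities (not study data).
y = [0, 0, 0, 1, 0, 1, 1, 1]
probs = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.90]

threshold, j = youden_cutoff(y, probs)
groups = [risk_group(p) for p in probs]
```

On this toy data the Youden index peaks at a threshold of 0.30; in practice a single Youden-optimal threshold yields a two-way split, so the second cutoff has to come from other considerations such as those the abstract lists.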
format Online
Article
Text
id pubmed-10514774
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10514774 2023-09-23 JMIR Public Health Surveill Original Paper JMIR Publications 2023-09-07 /pmc/articles/PMC10514774/ /pubmed/37676713 http://dx.doi.org/10.2196/47095 Text en
©Yuhan Deng, Yuan Ma, Jingzhu Fu, Xiaona Wang, Canqing Yu, Jun Lv, Sailimai Man, Bo Wang, Liming Li. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 07.09.2023.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.
title Combinatorial Use of Machine Learning and Logistic Regression for Predicting Carotid Plaque Risk Among 5.4 Million Adults With Fatty Liver Disease Receiving Health Check-Ups: Population-Based Cross-Sectional Study
topic Original Paper