Comparison Between Statistical Model and Machine Learning Methods for Predicting the Risk of Renal Function Decline Using Routine Clinical Data in Health Screening

PURPOSE: Using machine learning methods to make predictions on unseen data offers an opportunity to improve accuracy by exploring complex interactions between risk factors. We therefore evaluated the performance of machine learning (ML) algorithms and compared them with logistic regression for predicti...

Bibliographic Details
Main Authors:	Cao, Xia; Lin, Yanhui; Yang, Binfang; Li, Ying; Zhou, Jiansong
Format:	Online Article Text
Language:	English
Published:	Dove 2022
Subjects:	Original Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9056070/ https://www.ncbi.nlm.nih.gov/pubmed/35502445 http://dx.doi.org/10.2147/RMHP.S346856

author	Cao, Xia; Lin, Yanhui; Yang, Binfang; Li, Ying; Zhou, Jiansong
collection	PubMed
description	PURPOSE: Using machine learning methods to make predictions on unseen data offers an opportunity to improve accuracy by exploring complex interactions between risk factors. We therefore evaluated the performance of machine learning (ML) algorithms and compared them with logistic regression for predicting the risk of renal function decline (RFD) using routine clinical data. PATIENTS AND METHODS: This retrospective cohort study included data from 2166 subjects, aged 35–74 years, provided by an adult health screening follow-up program between 2010 and 2020. Seven ML models were considered (random forest, gradient boosting, multilayer perceptron, support vector machine, K-nearest neighbors, adaptive boosting, and decision tree) and compared with standard logistic regression. There were 24 independent variables, and the baseline estimated glomerular filtration rate (eGFR) was used as the predictive variable. RESULTS: A total of 2166 participants (mean age 49.2±11.2 years, 63.3% male) were enrolled and randomly divided into a training set (n=1732) and a test set (n=434). The area under the receiver operating characteristic curve (AUROC) values for detecting RFD with the different models were above 0.85 during the training phase. Gradient boosting exhibited the best average prediction accuracy (AUROC: 0.914) among all algorithms validated in this study. Based on AUROC, the ML algorithms improved RFD prediction performance compared with the logistic regression model (AUROC: 0.882), except for the K-nearest neighbors and decision tree algorithms (AUROC: 0.854 and 0.824, respectively). However, the improvements over logistic regression were small (less than 4%) and nonsignificant. CONCLUSION: Our results indicate that the proposed RFD prediction model, built on a health screening dataset using ML algorithms, is readily applicable and produces validated results. However, logistic regression performs as well as the ML models in predicting the risk of RFD with simple clinical predictors.
format	Online Article Text
id	pubmed-9056070
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Dove
record_format	MEDLINE/PubMed
spelling	pubmed-9056070 2022-05-01 Risk Manag Healthc Policy Original Research Dove 2022-04-26 /pmc/articles/PMC9056070/ /pubmed/35502445 http://dx.doi.org/10.2147/RMHP.S346856 Text en © 2022 Cao et al. https://creativecommons.org/licenses/by-nc/3.0/ This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (https://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php).
title	Comparison Between Statistical Model and Machine Learning Methods for Predicting the Risk of Renal Function Decline Using Routine Clinical Data in Health Screening
topic	Original Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9056070/ https://www.ncbi.nlm.nih.gov/pubmed/35502445 http://dx.doi.org/10.2147/RMHP.S346856
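The abstract in this record compares models by AUROC on a held-out test set. As a minimal sketch of what that metric measures, the snippet below computes AUROC from its pairwise definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one) and then checks the split sizes and AUROC gap quoted in the abstract. The function name `auroc`, the toy labels, and both score lists are made up for illustration; they are not data or code from the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and scores (hypothetical; NOT data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores_a = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]  # hypothetical model A
scores_b = [0.9, 0.5, 0.4, 0.3, 0.6, 0.7, 0.1, 0.8]  # hypothetical model B

auc_a = auroc(y_true, scores_a)
auc_b = auroc(y_true, scores_b)
print(f"toy AUROC, model A: {auc_a:.4f}")
print(f"toy AUROC, model B: {auc_b:.4f}")

# Figures quoted in the abstract.
n_total, n_train, n_test = 2166, 1732, 434
auc_gb, auc_lr = 0.914, 0.882  # gradient boosting vs. logistic regression

print(f"training fraction: {n_train / n_total:.3f}")
print(f"absolute AUROC gap: {auc_gb - auc_lr:.3f}")
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; a rank-based computation would be the usual choice for larger datasets.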