A comparison of machine learning algorithms and traditional regression-based statistical modeling for predicting hypertension incidence in a Canadian population

Risk prediction models are frequently used to identify individuals at risk of developing hypertension. This study evaluates different machine learning algorithms and compares their predictive performance with the conventional Cox proportional hazards (PH) model to predict hypertension incidence usin...

Bibliographic Details
Main authors:	Chowdhury, Mohammad Ziaul Islam, Leung, Alexander A., Walker, Robin L., Sikdar, Khokan C., O’Beirne, Maeve, Quan, Hude, Turin, Tanvir C.
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9807553/ https://www.ncbi.nlm.nih.gov/pubmed/36593280 http://dx.doi.org/10.1038/s41598-022-27264-x

author	Chowdhury, Mohammad Ziaul Islam Leung, Alexander A. Walker, Robin L. Sikdar, Khokan C. O’Beirne, Maeve Quan, Hude Turin, Tanvir C.
author_sort	Chowdhury, Mohammad Ziaul Islam
collection	PubMed
description	Risk prediction models are frequently used to identify individuals at risk of developing hypertension. This study evaluates several machine learning algorithms and compares their predictive performance with that of the conventional Cox proportional hazards (PH) model for predicting hypertension incidence from survival data. We analyzed 18,322 participants and 24 candidate features from Alberta’s Tomorrow Project (ATP), a large cohort, to develop the prediction models. To select the top features, we applied five feature selection methods: two filter-based (univariate Cox p-value and C-index), two embedded (random survival forest and least absolute shrinkage and selection operator (Lasso)), and one constraint-based (statistically equivalent signature (SES)). Five machine learning algorithms were developed to predict hypertension incidence: the penalized regressions Ridge, Lasso, and Elastic Net (EN), random survival forest (RSF), and gradient boosting (GB), along with the conventional Cox PH model. Predictive performance was assessed using the C-index. The machine learning algorithms performed similarly to the conventional Cox PH model: average C-indexes were 0.78, 0.78, 0.78, 0.76, 0.76, and 0.77 for Ridge, Lasso, EN, RSF, GB, and Cox PH, respectively. Important features associated with each model are also presented. Our findings show little difference in predictive performance between the machine learning algorithms and the conventional Cox PH regression model in predicting hypertension incidence. In a moderately sized dataset with a reasonable number of features, conventional regression-based models perform similarly to machine learning algorithms, with good predictive accuracy.
format	Online Article Text
id	pubmed-9807553
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-9807553 2023-01-04 Sci Rep Article Nature Publishing Group UK 2023-01-02 /pmc/articles/PMC9807553/ /pubmed/36593280 http://dx.doi.org/10.1038/s41598-022-27264-x Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title	A comparison of machine learning algorithms and traditional regression-based statistical modeling for predicting hypertension incidence in a Canadian population
topic	Article
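The description above reports model performance as a C-index. As a rough illustration of what that number measures, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' code: the function name `harrell_c_index`, the toy follow-up times, event flags, and risk scores are all hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored survival data.

    times:       observed follow-up time per subject
    events:      1 if the event was observed, 0 if censored
    risk_scores: model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if the times differ and the earlier subject
        # actually had the event (assumption: tied times are skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event, higher risk: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five subjects, two of them censored.
toy_times = [5, 8, 3, 10, 6]
toy_events = [1, 0, 1, 1, 0]
toy_scores = [0.4, 0.3, 0.9, 0.2, 0.5]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk, so the reported averages of 0.76 to 0.78 indicate that the models rank roughly three out of four comparable pairs correctly.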
