
Weibull Regression and Machine Learning Survival Models: Methodology, Comparison, and Application to Biomedical Data Related to Cardiac Surgery

SIMPLE SUMMARY: This article proposes a comparative study between two models that can be used by researchers for the analysis of survival data: Weibull regression and random survival forest. The models are compared considering the error rate, the performance of the model through the Harrell C-index,...


Bibliographic Details
Main authors: Cavalcante, Thalytta; Ospina, Raydonal; Leiva, Víctor; Cabezas, Xavier; Martin-Barreiro, Carlos
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10045304/
https://www.ncbi.nlm.nih.gov/pubmed/36979135
http://dx.doi.org/10.3390/biology12030442
author Cavalcante, Thalytta
Ospina, Raydonal
Leiva, Víctor
Cabezas, Xavier
Martin-Barreiro, Carlos
collection PubMed
description SIMPLE SUMMARY: This article presents a comparative study of two models that researchers can use to analyze survival data: Weibull regression and random survival forest. The models are compared in terms of their error rate, their performance as measured by the Harrell C-index, and the variables they identify as relevant for survival prediction. A statistical analysis of a data set from the Heart Institute of the University of São Paulo, Brazil, has been carried out. The proposal has many applications in biology and medicine.
ABSTRACT: In this article, we propose a comparative study of two models that researchers can use to analyze survival data: (i) the Weibull regression model and (ii) the random survival forest (RSF) model. The models are compared in terms of their error rate, their performance as measured by the Harrell C-index, and the variables they identify as relevant for survival prediction. A statistical analysis of a data set from the Heart Institute of the University of São Paulo, Brazil, has been carried out. In the study, the length of stay in the operating room of patients undergoing cardiac surgery was used as the response variable. The results show that the RSF model has a lower error rate on the training and testing data sets, at 23.55% and 20.31%, respectively, than the Weibull model, which has an error rate of 23.82%. For the Harrell C-index, we obtain 0.76 and 0.79 for the RSF model (training and testing sets, respectively) and 0.76 for the Weibull model. After the selection procedure, the Weibull model retains the variables associated with the type of protocol and the type of patient, both statistically significant at the 5% level. The RSF model selects age, type of patient, and type of protocol as relevant variables for prediction. We employ the randomForestSRC package of the R software to perform our data analysis and computational experiments. The proposal that we present has many applications in biology and medicine, which are discussed in the conclusions of this work.
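The description above compares the models by error rate and by the Harrell C-index. As a minimal sketch of how these two quantities relate, the following Python code computes a simplified Harrell C-index on made-up data and derives an error rate as one minus the C-index. The function name `harrell_c_index`, the toy values, and the simplification of skipping pairs with tied times are assumptions made for illustration; this is not the authors' code, which the record says was run with the randomForestSRC package in R.

```python
from itertools import combinations


def harrell_c_index(times, events, risks):
    """Simplified Harrell C-index for right-censored data.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher = shorter expected time)
    Assumption: pairs with tied observed times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk went with the shorter time
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the study).
times = [5, 8, 12, 15, 20]
events = [1, 1, 0, 1, 1]
risks = [0.9, 0.4, 0.6, 0.5, 0.1]

c_index = harrell_c_index(times, events, risks)
error_rate = 1.0 - c_index  # error rate taken as 1 minus the C-index
print(f"C-index = {c_index:.2f}, error rate = {error_rate:.2%}")
```

Under the assumption that the error rate is defined as one minus the C-index, the reported error rates of 23.55%, 20.31%, and 23.82% correspond to C-index values of about 0.7645, 0.7969, and 0.7618, which is close to the 0.76, 0.79, and 0.76 given in the abstract.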
format Online
Article
Text
id pubmed-10045304
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10045304 2023-03-29 Biology (Basel) Article MDPI 2023-03-13 /pmc/articles/PMC10045304/ /pubmed/36979135 http://dx.doi.org/10.3390/biology12030442 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Weibull Regression and Machine Learning Survival Models: Methodology, Comparison, and Application to Biomedical Data Related to Cardiac Surgery
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10045304/
https://www.ncbi.nlm.nih.gov/pubmed/36979135
http://dx.doi.org/10.3390/biology12030442