
A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data


Bibliographic Details
Main authors: Nasejje, Justine B., Mwambi, Henry, Dheda, Keertan, Lesosky, Maia
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5534080/
https://www.ncbi.nlm.nih.gov/pubmed/28754093
http://dx.doi.org/10.1186/s12874-017-0383-8
author Nasejje, Justine B.
Mwambi, Henry
Dheda, Keertan
Lesosky, Maia
collection PubMed
description BACKGROUND: Random survival forest (RSF) models have been identified as alternatives to the Cox proportional hazards model for analysing time-to-event data. These methods, however, have been criticised for a bias that results from favouring covariates with many split-points, and conditional inference forests for time-to-event data have therefore been suggested. Conditional inference forests (CIF) are known to correct the bias in RSF models by separating the selection of the best covariate to split on from the search for the best split point of the selected covariate.
METHODS: In this study, we compare the random survival forest model to the conditional inference forest (CIF) model using twenty-two simulated time-to-event datasets. We also analysed two real time-to-event datasets. The first dataset is based on the survival of children under five years of age in Uganda and consists of categorical covariates, most of which have more than two levels (many split-points). The second dataset is based on the survival of patients with extensively drug-resistant tuberculosis (XDR TB) and consists mainly of categorical covariates with two levels (few split-points).
RESULTS: Based on the bootstrap cross-validated estimates of the integrated Brier score, the study findings indicate that the conditional inference forest model is superior to random survival forest models in analysing time-to-event data that consist of covariates with many split-points. However, conditional inference forests perform comparably to random survival forest models in analysing time-to-event data consisting of covariates with fewer split-points.
CONCLUSION: Although survival forests are promising methods for analysing time-to-event data, it is important to identify the best forest model for analysis based on the nature of the covariates of the dataset in question.
format Online
Article
Text
id pubmed-5534080
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5534080 2017-08-03
BMC Med Res Methodol
Research Article
BioMed Central 2017-07-28
/pmc/articles/PMC5534080/ /pubmed/28754093 http://dx.doi.org/10.1186/s12874-017-0383-8
Text en
© The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data
topic Research Article
work_keys_str_mv AT nasejjejustineb acomparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT mwambihenry acomparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT dhedakeertan acomparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT lesoskymaia acomparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT nasejjejustineb comparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT mwambihenry comparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT dhedakeertan comparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT lesoskymaia comparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
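The split-point bias mentioned in the description can be illustrated with a small simulation. The sketch below is not the authors' method: it assumes an uncensored Gaussian outcome and a variance-reduction split criterion as simplified stand-ins for survival times and the log-rank statistic, and the names `best_split_gain` and `many_level_win_rate` are hypothetical. Both covariates are pure noise, yet an exhaustive search over all cut points tends to pick the covariate that offers more candidate splits.

```python
import random


def sum_sq(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_gain(x, y):
    """Largest reduction in squared error over all binary splits x <= c."""
    sse_all = sum_sq(y)
    best = 0.0
    for c in sorted(set(x))[:-1]:  # every candidate cut point
        left = [v for xi, v in zip(x, y) if xi <= c]
        right = [v for xi, v in zip(x, y) if xi > c]
        best = max(best, sse_all - (sum_sq(left) + sum_sq(right)))
    return best


def many_level_win_rate(n_reps=300, n=100, levels=10, seed=1):
    """Fraction of replicates in which the many-level noise covariate is chosen."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_reps):
        # Outcome is independent of both covariates (assumption: no censoring).
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x_few = [rng.randrange(2) for _ in range(n)]        # one candidate split
        x_many = [rng.randrange(levels) for _ in range(n)]  # up to levels - 1 splits
        if best_split_gain(x_many, y) > best_split_gain(x_few, y):
            wins += 1
    return wins / n_reps


if __name__ == "__main__":
    print("many-level covariate selected in",
          many_level_win_rate(), "of replicates")
```

An unbiased procedure would select each noise covariate about half the time. As the description notes, conditional inference forests address this by choosing the covariate first and only then searching for its split point, a step this sketch does not implement.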