Predicting Colorectal Cancer Survival Using Time-to-Event Machine Learning: Retrospective Cohort Study

BACKGROUND: Machine learning (ML) methods have shown great potential in predicting colorectal cancer (CRC) survival. However, the ML models introduced thus far have mainly focused on binary outcomes and have not considered the time-to-event nature of this type of modeling. OBJECTIVE: This study aims...

Bibliographic Details
Main Authors:	Yang, Xulin; Qiu, Hang; Wang, Liya; Wang, Xiaodong
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2023
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10636616/ https://www.ncbi.nlm.nih.gov/pubmed/37883174 http://dx.doi.org/10.2196/44417

_version_	1785133244649308160
author	Yang, Xulin; Qiu, Hang; Wang, Liya; Wang, Xiaodong
author_sort	Yang, Xulin
collection	PubMed
description	BACKGROUND: Machine learning (ML) methods have shown great potential in predicting colorectal cancer (CRC) survival. However, the ML models introduced thus far have mainly focused on binary outcomes and have not considered the time-to-event nature of this type of modeling. OBJECTIVE: This study aims to evaluate the performance of ML approaches for modeling time-to-event survival data and develop transparent models for predicting CRC-specific survival. METHODS: The data set used in this retrospective cohort study contains information on patients who were newly diagnosed with CRC between December 28, 2012, and December 27, 2019, at West China Hospital, Sichuan University. We assessed the performance of 6 representative ML models, including random survival forest (RSF), gradient boosting machine (GBM), DeepSurv, DeepHit, neural net-extended time-dependent Cox (or Cox-Time), and neural multitask logistic regression (N-MTLR) in predicting CRC-specific survival. Multiple imputation by chained equations method was applied to handle missing values in variables. Multivariable analysis and clinical experience were used to select significant features associated with CRC survival. Model performance was evaluated in stratified 5-fold cross-validation repeated 5 times by using the time-dependent concordance index, integrated Brier score, calibration curves, and decision curves. The SHapley Additive exPlanations method was applied to calculate feature importance. RESULTS: A total of 2157 patients with CRC were included in this study. Among the 6 time-to-event ML models, the DeepHit model exhibited the best discriminative ability (time-dependent concordance index 0.789, 95% CI 0.779-0.799) and the RSF model produced better-calibrated survival estimates (integrated Brier score 0.096, 95% CI 0.094-0.099), but these are not statistically significant. Additionally, the RSF, GBM, DeepSurv, Cox-Time, and N-MTLR models have comparable predictive accuracy to the Cox Proportional Hazards model in terms of discrimination and calibration. The calibration curves showed that all the ML models exhibited good 5-year survival calibration. The decision curves for CRC-specific survival at 5 years showed that all the ML models, especially RSF, had higher net benefits than default strategies of treating all or no patients at a range of clinically reasonable risk thresholds. The SHapley Additive exPlanations method revealed that R0 resection, tumor-node-metastasis staging, and the number of positive lymph nodes were important factors for 5-year CRC-specific survival. CONCLUSIONS: This study showed the potential of applying time-to-event ML predictive algorithms to help predict CRC-specific survival. The RSF, GBM, Cox-Time, and N-MTLR algorithms could provide nonparametric alternatives to the Cox Proportional Hazards model in estimating the survival probability of patients with CRC. The transparent time-to-event ML models help clinicians to more accurately predict the survival rate for these patients and improve patient outcomes by enabling personalized treatment plans that are informed by explainable ML models.
format	Online Article Text
id	pubmed-10636616
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-10636616 2023-11-11 J Med Internet Res Original Paper JMIR Publications 2023-10-26 /pmc/articles/PMC10636616/ /pubmed/37883174 http://dx.doi.org/10.2196/44417 Text en ©Xulin Yang, Hang Qiu, Liya Wang, Xiaodong Wang. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 26.10.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
title	Predicting Colorectal Cancer Survival Using Time-to-Event Machine Learning: Retrospective Cohort Study
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10636616/ https://www.ncbi.nlm.nih.gov/pubmed/37883174 http://dx.doi.org/10.2196/44417