Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study

Bibliographic Details
Main Authors: Qin, Li; Liang, Zhikun; Xie, Jingwen; Ye, Guozeng; Guan, Pengcheng; Huang, Yaoyao; Li, Xiaoyan
Format: Online Article Text
Language: English
Published: AME Publishing Company 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007945/
https://www.ncbi.nlm.nih.gov/pubmed/36915444
http://dx.doi.org/10.21037/jgo-23-18
_version_ 1784905644746211328
author Qin, Li
Liang, Zhikun
Xie, Jingwen
Ye, Guozeng
Guan, Pengcheng
Huang, Yaoyao
Li, Xiaoyan
author_sort Qin, Li
collection PubMed
description BACKGROUND: Colorectal cancer (CRC) is a heterogeneous group of malignancies with distinct clinical features. The association of these features with venous thromboembolism (VTE) has yet to be clarified. Machine learning (ML) models are well suited to improving VTE prediction in CRC because they can take in a large number of features and learn implicit correlations from the dataset. METHODS: Data were extracted from 4,914 patients with colorectal cancer between August 2019 and August 2022, and 1,191 patients who underwent surgery on the primary tumor site with curative intent were included. The variables analyzed included patient-level factors, cancer-level factors, and laboratory test results. Model training was conducted on 30% of the dataset using ten-fold cross-validation, and model validation was performed using the total dataset. The primary outcome was VTE occurrence within 30 days after surgery. Six ML algorithms were applied for model fitting: logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), weighted support vector machine (SVM), a multilayer perceptron (MLP) network, and a long short-term memory (LSTM) network. Model evaluation was based on six indicators: area under the receiver operating characteristic curve (ROC-AUC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), and Brier score. Two existing VTE risk models (Caprini and Khorana) were used as benchmarks. RESULTS: The incidence of postoperative VTE was 10.8%. The ten most significant predictors were lymph node metastasis, C-reactive protein, tumor grade, anemia, primary tumor location, sex, age, D-dimer level, thrombin time, and tumor stage. The XGBoost model showed the best performance, with a ROC-AUC of 0.990, a SEN of 96.9%, and a SPE of 96.1% in the training dataset and a ROC-AUC of 0.908, a SEN of 77.5%, and a SPE of 93.7% in the validation dataset. All ML models outperformed the previously developed Caprini and Khorana models. CONCLUSIONS: This study developed postoperative VTE predictive models using six ML algorithms. The XGBoost VTE model might serve as a complementary tool for clinical VTE prophylaxis decision-making, and the proposed risk factors could inform VTE risk stratification in CRC patients.
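As an illustration of the six evaluation indicators named in the abstract, the following is a minimal Python sketch that computes them from binary outcome labels and predicted risks. It is not the authors' code. The function name `evaluate`, the 0.5 decision threshold, and the toy labels and probabilities are assumptions made for this example and are not study data. The sketch also assumes that both outcome classes are present and that at least one positive and one negative prediction occur, so that no denominator is zero.

```python
from typing import Dict, List


def evaluate(y_true: List[int], y_prob: List[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Compute ROC-AUC, SEN, SPE, PPV, NPV and Brier score.

    y_true holds binary labels (1 = event, 0 = no event) and y_prob the
    predicted event probabilities. The threshold is an assumed cut-off.
    """
    # Dichotomize the predicted risks at the chosen threshold.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # ROC-AUC as the probability that a randomly chosen positive case
    # receives a higher predicted risk than a randomly chosen negative
    # case, with ties counted as one half.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))

    # Brier score: mean squared difference between risk and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    return {
        "ROC-AUC": auc,
        "SEN": tp / (tp + fn),
        "SPE": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "Brier": brier,
    }


# Toy data, invented purely for illustration.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.2, 0.7, 0.3, 0.1, 0.05]
metrics = evaluate(y_true, y_prob)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

SEN, SPE, PPV, and NPV depend on the chosen threshold, whereas ROC-AUC and the Brier score are computed directly from the predicted probabilities.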
format Online
Article
Text
id pubmed-10007945
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher AME Publishing Company
record_format MEDLINE/PubMed
spelling pubmed-10007945 2023-03-12 J Gastrointest Oncol Original Article AME Publishing Company 2023-02-15 2023-02-28 /pmc/articles/PMC10007945/ /pubmed/36915444 http://dx.doi.org/10.21037/jgo-23-18 Text en 2023 Journal of Gastrointestinal Oncology. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
title Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007945/
https://www.ncbi.nlm.nih.gov/pubmed/36915444
http://dx.doi.org/10.21037/jgo-23-18