Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study
BACKGROUND: Colorectal cancer (CRC) is a heterogeneous group of malignancies distinguished by distinct clinical features. The association of these features with venous thromboembolism (VTE) is yet to be clarified. Machine learning (ML) models are well suited to improve VTE prediction in CRC due to t...
Main authors: | Qin, Li; Liang, Zhikun; Xie, Jingwen; Ye, Guozeng; Guan, Pengcheng; Huang, Yaoyao; Li, Xiaoyan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | AME Publishing Company 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007945/ https://www.ncbi.nlm.nih.gov/pubmed/36915444 http://dx.doi.org/10.21037/jgo-23-18 |
_version_ | 1784905644746211328 |
---|---|
author | Qin, Li; Liang, Zhikun; Xie, Jingwen; Ye, Guozeng; Guan, Pengcheng; Huang, Yaoyao; Li, Xiaoyan |
author_facet | Qin, Li; Liang, Zhikun; Xie, Jingwen; Ye, Guozeng; Guan, Pengcheng; Huang, Yaoyao; Li, Xiaoyan |
author_sort | Qin, Li |
collection | PubMed |
description | BACKGROUND: Colorectal cancer (CRC) is a heterogeneous group of malignancies distinguished by distinct clinical features. The association of these features with venous thromboembolism (VTE) is yet to be clarified. Machine learning (ML) models are well suited to improve VTE prediction in CRC because they can take in a large number of features and learn implicit correlations from the dataset. METHODS: Data were extracted from 4,914 patients with colorectal cancer between August 2019 and August 2022, and 1,191 patients who underwent surgery on the primary tumor site with curative intent were included. The variables analyzed included patient-level factors, cancer-level factors, and laboratory test results. Model training was conducted on 30% of the dataset using a ten-fold cross-validation method and model validation was performed using the total dataset. The primary outcome was VTE occurrence within 30 days after surgery. Six ML algorithms, including logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), weighted support vector machine (SVM), a multilayer perceptron (MLP) network, and a long short-term memory (LSTM) network, were applied for model fitting. The model evaluation was based on six indicators, including receiver operating characteristic curve-area under the curve (ROC-AUC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), and Brier score. Two previous VTE models (Caprini and Khorana) were used as the benchmarks. RESULTS: The incidence of postoperative VTE was 10.8%. The top ten significant predictors included lymph node metastasis, C-reactive protein, tumor grade, anemia, primary tumor location, sex, age, D-dimer level, thrombin time, and tumor stage. In our results, the XGBoost model showed the best performance, with a ROC-AUC of 0.990, a SEN of 96.9%, and a SPE of 96.1% in the training dataset and a ROC-AUC of 0.908, a SEN of 77.5%, and a SPE of 93.7% in the validation dataset. All ML models outperformed the previously developed models (Caprini and Khorana). CONCLUSIONS: This study developed postoperative VTE predictive models using six ML algorithms. The XGBoost VTE model might supply a complementary tool for clinical VTE prophylaxis decision-making and the proposed risk factors could shed some light on VTE risk stratification in CRC patients. |
format | Online Article Text |
id | pubmed-10007945 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | AME Publishing Company |
record_format | MEDLINE/PubMed |
spelling | pubmed-10007945 2023-03-12 Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study Qin, Li Liang, Zhikun Xie, Jingwen Ye, Guozeng Guan, Pengcheng Huang, Yaoyao Li, Xiaoyan J Gastrointest Oncol Original Article BACKGROUND: Colorectal cancer (CRC) is a heterogeneous group of malignancies distinguished by distinct clinical features. The association of these features with venous thromboembolism (VTE) is yet to be clarified. Machine learning (ML) models are well suited to improve VTE prediction in CRC because they can take in a large number of features and learn implicit correlations from the dataset. METHODS: Data were extracted from 4,914 patients with colorectal cancer between August 2019 and August 2022, and 1,191 patients who underwent surgery on the primary tumor site with curative intent were included. The variables analyzed included patient-level factors, cancer-level factors, and laboratory test results. Model training was conducted on 30% of the dataset using a ten-fold cross-validation method and model validation was performed using the total dataset. The primary outcome was VTE occurrence within 30 days after surgery. Six ML algorithms, including logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), weighted support vector machine (SVM), a multilayer perceptron (MLP) network, and a long short-term memory (LSTM) network, were applied for model fitting. The model evaluation was based on six indicators, including receiver operating characteristic curve-area under the curve (ROC-AUC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), and Brier score. Two previous VTE models (Caprini and Khorana) were used as the benchmarks. RESULTS: The incidence of postoperative VTE was 10.8%. The top ten significant predictors included lymph node metastasis, C-reactive protein, tumor grade, anemia, primary tumor location, sex, age, D-dimer level, thrombin time, and tumor stage. In our results, the XGBoost model showed the best performance, with a ROC-AUC of 0.990, a SEN of 96.9%, and a SPE of 96.1% in the training dataset and a ROC-AUC of 0.908, a SEN of 77.5%, and a SPE of 93.7% in the validation dataset. All ML models outperformed the previously developed models (Caprini and Khorana). CONCLUSIONS: This study developed postoperative VTE predictive models using six ML algorithms. The XGBoost VTE model might supply a complementary tool for clinical VTE prophylaxis decision-making and the proposed risk factors could shed some light on VTE risk stratification in CRC patients. AME Publishing Company 2023-02-15 2023-02-28 /pmc/articles/PMC10007945/ /pubmed/36915444 http://dx.doi.org/10.21037/jgo-23-18 Text en 2023 Journal of Gastrointestinal Oncology. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Original Article Qin, Li Liang, Zhikun Xie, Jingwen Ye, Guozeng Guan, Pengcheng Huang, Yaoyao Li, Xiaoyan Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study |
title | Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study |
title_full | Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study |
title_fullStr | Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study |
title_full_unstemmed | Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study |
title_short | Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study |
title_sort | development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007945/ https://www.ncbi.nlm.nih.gov/pubmed/36915444 http://dx.doi.org/10.21037/jgo-23-18 |
work_keys_str_mv | AT qinli developmentandvalidationofmachinelearningmodelsforpostoperativevenousthromboembolismpredictionincolorectalcancerinpatientsaretrospectivestudy AT liangzhikun developmentandvalidationofmachinelearningmodelsforpostoperativevenousthromboembolismpredictionincolorectalcancerinpatientsaretrospectivestudy AT xiejingwen developmentandvalidationofmachinelearningmodelsforpostoperativevenousthromboembolismpredictionincolorectalcancerinpatientsaretrospectivestudy AT yeguozeng developmentandvalidationofmachinelearningmodelsforpostoperativevenousthromboembolismpredictionincolorectalcancerinpatientsaretrospectivestudy AT guanpengcheng developmentandvalidationofmachinelearningmodelsforpostoperativevenousthromboembolismpredictionincolorectalcancerinpatientsaretrospectivestudy AT huangyaoyao developmentandvalidationofmachinelearningmodelsforpostoperativevenousthromboembolismpredictionincolorectalcancerinpatientsaretrospectivestudy AT lixiaoyan developmentandvalidationofmachinelearningmodelsforpostoperativevenousthromboembolismpredictionincolorectalcancerinpatientsaretrospectivestudy |
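
The abstract in this record names six evaluation indicators (ROC-AUC, SEN, SPE, PPV, NPV, and Brier score). As a rough illustration of how such indicators can be computed from predicted probabilities, here is a minimal Python sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for this example, and the study's own data and thresholds are not reproduced here.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) after thresholding the probabilities.

    The threshold of 0.5 is an assumption for illustration only.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def ratio(num: int, den: int) -> float:
    """Division that returns NaN instead of raising when the denominator is 0."""
    return num / den if den else float("nan")


def evaluate(y_true: Sequence[int], y_prob: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Compute the six indicators named in the abstract."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "roc_auc": roc_auc(y_true, y_prob),
        "sen": ratio(tp, tp + fn),   # sensitivity: true positive rate
        "spe": ratio(tn, tn + fp),   # specificity: true negative rate
        "ppv": ratio(tp, tp + fp),   # positive predictive value
        "npv": ratio(tn, tn + fn),   # negative predictive value
        "brier": brier_score(y_true, y_prob),
    }


if __name__ == "__main__":
    # Toy data, invented for this example (1 = VTE, 0 = no VTE).
    labels = [1, 1, 0, 0, 0, 1, 0, 0]
    probs = [0.9, 0.6, 0.4, 0.2, 0.7, 0.3, 0.1, 0.5]
    for name, value in evaluate(labels, probs).items():
        print(f"{name}: {value:.3f}")
```

SEN, SPE, PPV, and NPV depend on the chosen threshold, whereas ROC-AUC and the Brier score are computed directly from the probabilities, which is one reason a study may report both kinds of indicator.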