Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study

BACKGROUND: Colorectal cancer (CRC) is a heterogeneous group of malignancies distinguished by distinct clinical features. The association of these features with venous thromboembolism (VTE) is yet to be clarified. Machine learning (ML) models are well suited to improve VTE prediction in CRC due to t...

Bibliographic Details
Main authors:	Qin, Li; Liang, Zhikun; Xie, Jingwen; Ye, Guozeng; Guan, Pengcheng; Huang, Yaoyao; Li, Xiaoyan
Format:	Online Article Text
Language:	English
Published:	AME Publishing Company 2023
Subjects:	Original Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007945/ https://www.ncbi.nlm.nih.gov/pubmed/36915444 http://dx.doi.org/10.21037/jgo-23-18

_version_	1784905644746211328
author	Qin, Li; Liang, Zhikun; Xie, Jingwen; Ye, Guozeng; Guan, Pengcheng; Huang, Yaoyao; Li, Xiaoyan
author_facet	Qin, Li; Liang, Zhikun; Xie, Jingwen; Ye, Guozeng; Guan, Pengcheng; Huang, Yaoyao; Li, Xiaoyan
author_sort	Qin, Li
collection	PubMed
description	BACKGROUND: Colorectal cancer (CRC) is a heterogeneous group of malignancies distinguished by distinct clinical features. The association of these features with venous thromboembolism (VTE) is yet to be clarified. Machine learning (ML) models are well suited to improve VTE prediction in CRC due to their ability to take in a large number of features and learn implicit correlations from the dataset. METHODS: Data were extracted from 4,914 patients with colorectal cancer between August 2019 and August 2022, and 1,191 patients who underwent surgery on the primary tumor site with curative intent were included. The variables analyzed included patient-level factors, cancer-level factors, and laboratory test results. Model training was conducted on 30% of the dataset using ten-fold cross-validation, and model validation was performed using the total dataset. The primary outcome was VTE occurrence within 30 days after surgery. Six ML algorithms, including logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), weighted support vector machine (SVM), a multilayer perceptron (MLP) network, and a long short-term memory (LSTM) network, were applied for model fitting. Model evaluation was based on six indicators: area under the receiver operating characteristic curve (ROC-AUC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), and Brier score. Two previous VTE models (Caprini and Khorana) were used as benchmarks. RESULTS: The incidence of postoperative VTE was 10.8%. The top ten significant predictors included lymph node metastasis, C-reactive protein, tumor grade, anemia, primary tumor location, sex, age, D-dimer level, thrombin time, and tumor stage. The XGBoost model showed the best performance, with a ROC-AUC of 0.990, a SEN of 96.9%, and a SPE of 96.1% in the training dataset, and a ROC-AUC of 0.908, a SEN of 77.5%, and a SPE of 93.7% in the validation dataset. All ML models outperformed the previously developed models (Caprini and Khorana). CONCLUSIONS: This study developed postoperative VTE predictive models using six ML algorithms. The XGBoost VTE model might serve as a complementary tool for clinical VTE prophylaxis decision-making, and the proposed risk factors could shed some light on VTE risk stratification in CRC patients.
format	Online Article Text
id	pubmed-10007945
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	AME Publishing Company
record_format	MEDLINE/PubMed
spelling	pubmed-10007945 2023-03-12 J Gastrointest Oncol Original Article AME Publishing Company 2023-02-15 2023-02-28 /pmc/articles/PMC10007945/ /pubmed/36915444 http://dx.doi.org/10.21037/jgo-23-18 Text en 2023 Journal of Gastrointestinal Oncology. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
title	Development and validation of machine learning models for postoperative venous thromboembolism prediction in colorectal cancer inpatients: a retrospective study
topic	Original Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007945/ https://www.ncbi.nlm.nih.gov/pubmed/36915444 http://dx.doi.org/10.21037/jgo-23-18

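The six evaluation indicators named in the description field (ROC-AUC, SEN, SPE, PPV, NPV, and Brier score) can be illustrated with a small sketch. This is not the authors' code. The function names, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
from typing import Sequence


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def evaluate(y_true: Sequence[int], y_prob: Sequence[float],
             threshold: float = 0.5) -> dict[str, float]:
    """Compute six indicators for binary labels (1 = VTE, 0 = no VTE).

    The threshold of 0.5 is a hypothetical default, not a value
    taken from the study.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # ROC-AUC as the probability that a randomly chosen positive case
    # receives a higher score than a randomly chosen negative case
    # (ties count as one half).
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)

    # Brier score: mean squared difference between predicted
    # probability and observed outcome (lower is better).
    brier = _ratio(sum((p - t) ** 2 for t, p in zip(y_true, y_prob)),
                   len(y_true))

    return {
        "roc_auc": _ratio(wins, len(pos) * len(neg)),
        "sen": _ratio(tp, tp + fn),   # sensitivity
        "spe": _ratio(tn, tn + fp),   # specificity
        "ppv": _ratio(tp, tp + fp),   # positive predictive value
        "npv": _ratio(tn, tn + fn),   # negative predictive value
        "brier": brier,
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 1, 0, 0, 0, 1, 0, 0]
    probs = [0.9, 0.6, 0.4, 0.2, 0.7, 0.3, 0.1, 0.5]
    for name, value in evaluate(labels, probs).items():
        print(f"{name}: {value:.3f}")
```

SEN, SPE, PPV, and NPV depend on the chosen threshold, whereas ROC-AUC and the Brier score are computed from the predicted probabilities directly.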