
Machine Learning Model for Predicting Postoperative Survival of Patients with Colorectal Cancer

PURPOSE: Machine learning (ML) is a strong candidate for making accurate predictions because it can apply powerful computational algorithms to large amounts of data. We developed an ML-based model to predict survival of patients with colorectal cancer (CRC) using data from two independent datasets. MATERI...


Bibliographic Details
Main Authors: Osman, Mohamed Hosny, Mohamed, Reham Hosny, Sarhan, Hossam Mohamed, Park, Eun Jung, Baik, Seung Hyuk, Lee, Kang Young, Kang, Jeonghyun
Format: Online Article Text
Language: English
Published: Korean Cancer Association 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9016295/
https://www.ncbi.nlm.nih.gov/pubmed/34126702
http://dx.doi.org/10.4143/crt.2021.206
_version_ 1784688499761348608
author Osman, Mohamed Hosny
Mohamed, Reham Hosny
Sarhan, Hossam Mohamed
Park, Eun Jung
Baik, Seung Hyuk
Lee, Kang Young
Kang, Jeonghyun
collection PubMed
description PURPOSE: Machine learning (ML) is a strong candidate for making accurate predictions because it can apply powerful computational algorithms to large amounts of data. We developed an ML-based model to predict survival of patients with colorectal cancer (CRC) using data from two independent datasets. MATERIALS AND METHODS: A total of 364,316 and 1,572 CRC patients were included from the Surveillance, Epidemiology, and End Results (SEER) and a Korean dataset, respectively. Because SEER combines data from 18 cancer registries, internal validation was done using 18-fold cross-validation, and external validation was then performed by testing the trained model on the Korean dataset. Performance was evaluated using the area under the receiver operating characteristic curve (AUROC), sensitivity, and positive predictive value. RESULTS: Clinicopathological characteristics differed significantly between the two datasets, and SEER showed a significantly lower 5-year survival rate than the Korean dataset (60.1% vs. 75.3%, p < 0.001). The ML-based model using the Light Gradient Boosting algorithm predicted 5-year survival better than American Joint Committee on Cancer stage (AUROC, 0.804 vs. 0.736; p < 0.001). The most important features influencing model performance were age, number of examined lymph nodes, and tumor size. In the validation set, sensitivity for predicting 5-year survival was 68.14% and 77.51% for the dead and alive classes, respectively, and the corresponding positive predictive values were 49.88% and 88.1%. Survival probability can be checked using the web-based survival predictor (http://colorectalcancer.pythonanywhere.com). CONCLUSION: The ML-based model achieved much better performance than staging in individualized estimation of survival of patients with CRC.
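The per-class sensitivity and positive predictive values quoted in the description can be related to one another with a little arithmetic. The sketch below is not the authors' code; it is a minimal illustration that assumes the validation set has the Korean dataset's 5-year survival rate of 75.3% (the abstract does not state the class balance of the validation set explicitly). The function name `ppv_from_sensitivities` and its parameters are hypothetical.

```python
def ppv_from_sensitivities(sens_dead, sens_alive, frac_alive):
    """Derive per-class positive predictive values for a two-class problem.

    sens_dead, sens_alive: per-class sensitivity (recall), as fractions.
    frac_alive: fraction of patients who are actually alive at 5 years
                (ASSUMED here to equal the Korean dataset's survival rate).
    """
    frac_dead = 1.0 - frac_alive
    true_dead = sens_dead * frac_dead              # dead, predicted dead
    false_dead = (1.0 - sens_alive) * frac_alive   # alive, predicted dead
    true_alive = sens_alive * frac_alive           # alive, predicted alive
    false_alive = (1.0 - sens_dead) * frac_dead    # dead, predicted alive
    ppv_dead = true_dead / (true_dead + false_dead)
    ppv_alive = true_alive / (true_alive + false_alive)
    return ppv_dead, ppv_alive


# Sensitivities from the abstract; the 0.753 prevalence is an assumption.
ppv_dead, ppv_alive = ppv_from_sensitivities(0.6814, 0.7751, 0.753)
print(f"PPV dead: {ppv_dead:.1%}, PPV alive: {ppv_alive:.1%}")
```

Under that assumption the derived values land close to the reported 49.88% and 88.1%, which suggests that 68.14% and 77.51% are the sensitivities for the dead and alive classes and that 49.88% and 88.1% are the matching positive predictive values.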
format Online
Article
Text
id pubmed-9016295
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Korean Cancer Association
record_format MEDLINE/PubMed
spelling pubmed-9016295 2022-04-27 Machine Learning Model for Predicting Postoperative Survival of Patients with Colorectal Cancer Osman, Mohamed Hosny; Mohamed, Reham Hosny; Sarhan, Hossam Mohamed; Park, Eun Jung; Baik, Seung Hyuk; Lee, Kang Young; Kang, Jeonghyun Cancer Res Treat Original Article Korean Cancer Association 2022-04 2021-06-15 /pmc/articles/PMC9016295/ /pubmed/34126702 http://dx.doi.org/10.4143/crt.2021.206 Text en Copyright © 2022 by the Korean Cancer Association. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Machine Learning Model for Predicting Postoperative Survival of Patients with Colorectal Cancer
topic Original Article