Prediction of survival in oropharyngeal squamous cell carcinoma using machine learning algorithms: A study based on the surveillance, epidemiology, and end results database

BACKGROUND: We determined appropriate survival prediction machine learning models for patients with oropharyngeal squamous cell carcinoma (OPSCC) using the “Surveillance, Epidemiology, and End Results” (SEER) database. METHODS: In total, 4039 patients diagnosed with OPSCC between 2004 and 2016 were...

Bibliographic Details
Main authors: Kim, Su Il, Kang, Jeong Wook, Eun, Young-Gyu, Lee, Young Chan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Oncology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9441569/
https://www.ncbi.nlm.nih.gov/pubmed/36072804
http://dx.doi.org/10.3389/fonc.2022.974678
author Kim, Su Il
Kang, Jeong Wook
Eun, Young-Gyu
Lee, Young Chan
collection PubMed
description BACKGROUND: We determined appropriate survival prediction machine learning models for patients with oropharyngeal squamous cell carcinoma (OPSCC) using the “Surveillance, Epidemiology, and End Results” (SEER) database.
METHODS: In total, 4039 patients diagnosed with OPSCC between 2004 and 2016 were enrolled in this study. In particular, 13 variables were selected and analyzed: age, sex, tumor grade, tumor size, neck dissection, radiation therapy, cancer-directed surgery, chemotherapy, T stage, N stage, M stage, clinical stage, and human papillomavirus (HPV) status. The T-, N-, and clinical staging were reconstructed based on the American Joint Committee on Cancer (AJCC) Staging Manual, 8th Edition. The patients were randomly assigned to a development or test dataset at a 7:3 ratio. The extremely randomized survival tree (EST), conditional survival forest (CSF), and DeepSurv models were used to predict the overall and disease-specific survival in patients with OPSCC. A 10-fold cross-validation on a development dataset was used to build the training and internal validation data for all models. We evaluated the predictive performance of each model using test datasets.
RESULTS: A higher C-index value and lower integrated Brier score (IBS), root mean square error (RMSE), and mean absolute error (MAE) indicate a better performance from a machine learning model. The C-index was the highest for the DeepSurv model (0.77). The IBS was also the lowest in the DeepSurv model (0.08). However, the RMSE and RAE were the lowest for the CSF model.
CONCLUSIONS: We demonstrated various machine-learning-based survival prediction models. The CSF model showed a better performance in predicting the survival of patients with OPSCC in terms of the RMSE and RAE. In this context, machine learning models based on personalized survival predictions can be used to stratify various complex risk factors. This could help in designing personalized treatments and predicting prognoses for patients.
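The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes Harrell's definition of the C-index (tied event times skipped for simplicity) and plain RMSE and MAE over observed versus predicted values, and the function names and toy data are hypothetical. How the study handled censoring when computing RMSE and MAE is not stated in the abstract.

```python
from itertools import combinations
from math import sqrt


def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ordered correctly by risk.

    times  -- follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): ignore tied times
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter follow-up is censored, so the pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Toy data (hypothetical, not from the study)
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.6]
c_index = concordance_index(times, events, risks)  # 3 of 4 comparable pairs agree
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why a higher value indicates better discrimination, while lower RMSE and MAE indicate smaller prediction errors.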
format Online
Article
Text
id pubmed-9441569
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9441569 2022-09-06 Front Oncol Oncology Frontiers Media S.A. 2022-08-22 /pmc/articles/PMC9441569/ /pubmed/36072804 http://dx.doi.org/10.3389/fonc.2022.974678 Text en Copyright © 2022 Kim, Kang, Eun and Lee https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Prediction of survival in oropharyngeal squamous cell carcinoma using machine learning algorithms: A study based on the surveillance, epidemiology, and end results database
topic Oncology