Prediction of survival in oropharyngeal squamous cell carcinoma using machine learning algorithms: A study based on the surveillance, epidemiology, and end results database
BACKGROUND: We determined appropriate survival prediction machine learning models for patients with oropharyngeal squamous cell carcinoma (OPSCC) using the “Surveillance, Epidemiology, and End Results” (SEER) database. METHODS: In total, 4039 patients diagnosed with OPSCC between 2004 and 2016 were...
Main authors: | Kim, Su Il; Kang, Jeong Wook; Eun, Young-Gyu; Lee, Young Chan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Oncology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9441569/ https://www.ncbi.nlm.nih.gov/pubmed/36072804 http://dx.doi.org/10.3389/fonc.2022.974678 |
_version_ | 1784782607436742656 |
---|---|
author | Kim, Su Il Kang, Jeong Wook Eun, Young-Gyu Lee, Young Chan |
author_facet | Kim, Su Il Kang, Jeong Wook Eun, Young-Gyu Lee, Young Chan |
author_sort | Kim, Su Il |
collection | PubMed |
description | BACKGROUND: We determined appropriate survival prediction machine learning models for patients with oropharyngeal squamous cell carcinoma (OPSCC) using the “Surveillance, Epidemiology, and End Results” (SEER) database. METHODS: In total, 4039 patients diagnosed with OPSCC between 2004 and 2016 were enrolled in this study. Thirteen variables were selected and analyzed: age, sex, tumor grade, tumor size, neck dissection, radiation therapy, cancer-directed surgery, chemotherapy, T stage, N stage, M stage, clinical stage, and human papillomavirus (HPV) status. The T, N, and clinical stages were reconstructed based on the American Joint Committee on Cancer (AJCC) Staging Manual, 8th Edition. The patients were randomly assigned to a development or test dataset at a 7:3 ratio. The extremely randomized survival tree (EST), conditional survival forest (CSF), and DeepSurv models were used to predict the overall and disease-specific survival in patients with OPSCC. Ten-fold cross-validation on the development dataset was used to build the training and internal validation data for all models. We evaluated the predictive performance of each model using the test dataset. RESULTS: A higher C-index value and lower integrated Brier score (IBS), root mean square error (RMSE), and mean absolute error (MAE) indicate better performance of a machine learning model. The C-index was the highest for the DeepSurv model (0.77). The IBS was also the lowest for the DeepSurv model (0.08). However, the RMSE and MAE were the lowest for the CSF model. CONCLUSIONS: We demonstrated various machine-learning-based survival prediction models. The CSF model showed better performance in predicting the survival of patients with OPSCC in terms of the RMSE and MAE. In this context, machine learning models based on personalized survival predictions can be used to stratify various complex risk factors. This could help in designing personalized treatments and predicting prognoses for patients. |
format | Online Article Text |
id | pubmed-9441569 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9441569 2022-09-06 Front Oncol Oncology Frontiers Media S.A. 2022-08-22 /pmc/articles/PMC9441569/ /pubmed/36072804 http://dx.doi.org/10.3389/fonc.2022.974678 Text en Copyright © 2022 Kim, Kang, Eun and Lee https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Prediction of survival in oropharyngeal squamous cell carcinoma using machine learning algorithms: A study based on the surveillance, epidemiology, and end results database |
topic | Oncology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9441569/ https://www.ncbi.nlm.nih.gov/pubmed/36072804 http://dx.doi.org/10.3389/fonc.2022.974678 |
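
The abstract in this record ranks models by the C-index (higher is better) and by error measures such as the RMSE and MAE (lower is better). As a rough illustration of what those numbers mean, the sketch below computes Harrell's concordance index and the two error measures on a small made-up dataset. It is not the authors' code: the function names, the toy values, and the choice to skip pairs with tied times are all assumptions made for this example, and the record does not say how the study computed its RMSE and MAE.

```python
from itertools import combinations
from math import sqrt


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    times  -- observed follow-up times
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk scores (higher means shorter expected survival)
    Assumption for this sketch: pairs with tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the earlier time is an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy data (hypothetical): four patients, the third one censored.
toy_times = [5, 8, 12, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risks)
print(f"C-index: {c_index:.2f}")
print(f"RMSE: {rmse([0, 0], [3, 4]):.3f}, MAE: {mae([0, 0], [3, 4]):.1f}")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the 0.77 reported for DeepSurv.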