Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma
SIMPLE SUMMARY: In this study, we developed a generic framework to analyze the Surveillance, Epidemiology, and End Results (SEER) database to generate reliable machine learning (ML) prediction models for cancer recurrence. As a proof-of-concept, using 130,979 oral tongue squamous cell carcinoma pati...
Main authors: | Fatapour, Yasaman; Abiri, Arash; Kuan, Edward C.; Brody, James P. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10216090/ https://www.ncbi.nlm.nih.gov/pubmed/37345106 http://dx.doi.org/10.3390/cancers15102769 |
_version_ | 1785048215472570368 |
---|---|
author | Fatapour, Yasaman Abiri, Arash Kuan, Edward C. Brody, James P. |
author_sort | Fatapour, Yasaman |
collection | PubMed |
description | SIMPLE SUMMARY: In this study, we developed a generic framework to analyze the Surveillance, Epidemiology, and End Results (SEER) database to generate reliable machine learning (ML) prediction models for cancer recurrence. As a proof-of-concept, using 130,979 oral tongue squamous cell carcinoma patients, we generated ML models to predict 5- and 10-year recurrence with high accuracy, recall, and precision. Thus, we demonstrate an effective framework for guiding future ML efforts in predicting cancer recurrence using the SEER database, with implications for the guidance of patient management and follow-up care. ABSTRACT: Despite diagnostic advancements, the development of reliable prognostic systems for assessing the risk of cancer recurrence remains a challenge. In this study, we developed a novel framework to generate highly representative machine-learning prediction models for oral tongue squamous cell carcinoma (OTSCC) recurrence. We identified cases of 5- and 10-year OTSCC recurrence from the SEER database. Four classification models were trained using the H2O.ai platform, and their performance was assessed according to accuracy, recall, precision, and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Feature importance was studied by evaluating Shapley additive explanation (SHAP) contribution plots. Of the 130,979 patients studied, 36,042 (27.5%) were female, and the mean (SD) age was 58.2 (13.7) years. The Gradient Boosting Machine model performed the best, achieving 81.8% accuracy and 97.7% precision for 5-year prediction. Moreover, 10-year predictions demonstrated 80.0% accuracy and 94.0% precision. The number of prior tumors, patient age, the site of cancer recurrence, and tumor histology were the most significant predictors. The implementation of our novel SEER framework enabled the successful identification of patients with OTSCC recurrence, with which highly accurate and sensitive prediction models were generated. Thus, we demonstrate our framework’s potential for application in various cancers to build generalizable screening tools to predict tumor recurrence. |
format | Online Article Text |
id | pubmed-10216090 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10216090 2023-05-27 Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma Fatapour, Yasaman Abiri, Arash Kuan, Edward C. Brody, James P. Cancers (Basel) Article MDPI 2023-05-16 /pmc/articles/PMC10216090/ /pubmed/37345106 http://dx.doi.org/10.3390/cancers15102769 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Fatapour, Yasaman Abiri, Arash Kuan, Edward C. Brody, James P. Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma |
title | Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10216090/ https://www.ncbi.nlm.nih.gov/pubmed/37345106 http://dx.doi.org/10.3390/cancers15102769 |
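
The abstract reports accuracy, recall, precision, and ROC AUC for a binary recurrence classifier. As a reading aid, the sketch below shows how those four metrics are conventionally computed from labels and predicted scores. It is not the authors' pipeline and does not use the H2O.ai platform: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, chosen only for illustration, and the numbers it produces are unrelated to the results reported in the record.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP/FP/TN/FN for binary labels, where 1 means recurrence."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["tn" if truth == 0 else "fn"] += 1
    return counts


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall from hard 0/1 predictions."""
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total if total else 0.0,
        "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": c["tp"] / actual_pos if actual_pos else 0.0,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive case outscores
    a random negative case, with ties counted as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = recurrence, 0 = no recurrence.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
THRESHOLD = 0.5  # assumed cut-off, not taken from the paper
toy_preds = [1 if s >= THRESHOLD else 0 for s in toy_scores]

if __name__ == "__main__":
    print(classification_metrics(toy_labels, toy_preds))
    print(roc_auc(toy_labels, toy_scores))
```

Precision and recall answer different questions: precision is the share of predicted recurrences that were real, while recall is the share of real recurrences that were caught. AUC, unlike the other three, does not depend on the chosen threshold.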