Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma

SIMPLE SUMMARY: In this study, we developed a generic framework to analyze the Surveillance, Epidemiology, and End Results (SEER) database to generate reliable machine learning (ML) prediction models for cancer recurrence. As a proof-of-concept, using 130,979 oral tongue squamous cell carcinoma pati...

Bibliographic Details
Main authors: Fatapour, Yasaman, Abiri, Arash, Kuan, Edward C., Brody, James P.
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10216090/
https://www.ncbi.nlm.nih.gov/pubmed/37345106
http://dx.doi.org/10.3390/cancers15102769
_version_ 1785048215472570368
author Fatapour, Yasaman
Abiri, Arash
Kuan, Edward C.
Brody, James P.
author_facet Fatapour, Yasaman
Abiri, Arash
Kuan, Edward C.
Brody, James P.
author_sort Fatapour, Yasaman
collection PubMed
description SIMPLE SUMMARY: In this study, we developed a generic framework to analyze the Surveillance, Epidemiology, and End Results (SEER) database to generate reliable machine learning (ML) prediction models for cancer recurrence. As a proof-of-concept, using 130,979 oral tongue squamous cell carcinoma patients, we generated ML models to predict 5- and 10-year recurrence with high accuracy, recall, and precision. Thus, we demonstrate an effective framework for guiding future ML efforts in predicting cancer recurrence using the SEER database, with implications for the guidance of patient management and follow-up care. ABSTRACT: Despite diagnostic advancements, the development of reliable prognostic systems for assessing the risk of cancer recurrence remains a challenge. In this study, we developed a novel framework to generate highly representative machine-learning prediction models for oral tongue squamous cell carcinoma (OTSCC) recurrence. We identified cases of 5- and 10-year OTSCC recurrence from the SEER database. Four classification models were trained using the H2O.ai platform, and their performance was assessed according to accuracy, recall, precision, and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Feature importance was studied by evaluating Shapley additive explanation contribution plots. Of the 130,979 patients studied, 36,042 (27.5%) were female, and the mean (SD) age was 58.2 (13.7) years. The Gradient Boosting Machine model performed the best, achieving 81.8% accuracy and 97.7% precision for 5-year prediction. Moreover, 10-year predictions demonstrated 80.0% accuracy and 94.0% precision. The number of prior tumors, patient age, the site of cancer recurrence, and tumor histology were the most significant predictors.
The implementation of our novel SEER framework enabled the successful identification of patients with OTSCC recurrence, with which highly accurate and sensitive prediction models were generated. Thus, we demonstrate our framework’s potential for application in various cancers to build generalizable screening tools to predict tumor recurrence.
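The abstract reports accuracy, recall, and precision for a binary recurrence classifier. As a minimal sketch of how these three metrics relate to a confusion matrix, the following Python snippet computes them from true and predicted labels. It is not the authors' pipeline and does not use the H2O.ai platform; the names `ConfusionCounts`, `count_outcomes`, and `classification_metrics` are hypothetical, and the labels in the usage note are made-up toy values, not SEER data.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Cell counts of a binary confusion matrix (1 = recurrence, 0 = none)."""
    tp: int  # predicted recurrence, recurrence observed
    fp: int  # predicted recurrence, none observed
    tn: int  # predicted none, none observed
    fn: int  # predicted none, recurrence observed


def count_outcomes(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Tally the four confusion-matrix cells from paired 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return accuracy, precision, and recall; 0.0 when a denominator is zero."""
    total = c.tp + c.fp + c.tn + c.fn
    predicted_pos = c.tp + c.fp
    actual_pos = c.tp + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total if total else 0.0,
        "precision": c.tp / predicted_pos if predicted_pos else 0.0,
        "recall": c.tp / actual_pos if actual_pos else 0.0,
    }
```

For example, `classification_metrics(count_outcomes([1, 1, 1, 0, 0, 1], [1, 1, 0, 0, 1, 1]))` tallies three true positives, one false positive, one true negative, and one false negative. Precision answers "of the patients flagged, how many recurred?", while recall answers "of the patients who recurred, how many were flagged?", which is why the two can differ sharply on imbalanced data.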
format Online
Article
Text
id pubmed-10216090
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10216090 2023-05-27 Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma Fatapour, Yasaman Abiri, Arash Kuan, Edward C. Brody, James P. Cancers (Basel) Article SIMPLE SUMMARY: In this study, we developed a generic framework to analyze the Surveillance, Epidemiology, and End Results (SEER) database to generate reliable machine learning (ML) prediction models for cancer recurrence. As a proof-of-concept, using 130,979 oral tongue squamous cell carcinoma patients, we generated ML models to predict 5- and 10-year recurrence with high accuracy, recall, and precision. Thus, we demonstrate an effective framework for guiding future ML efforts in predicting cancer recurrence using the SEER database, with implications for the guidance of patient management and follow-up care. ABSTRACT: Despite diagnostic advancements, the development of reliable prognostic systems for assessing the risk of cancer recurrence remains a challenge. In this study, we developed a novel framework to generate highly representative machine-learning prediction models for oral tongue squamous cell carcinoma (OTSCC) recurrence. We identified cases of 5- and 10-year OTSCC recurrence from the SEER database. Four classification models were trained using the H2O.ai platform, and their performance was assessed according to accuracy, recall, precision, and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Feature importance was studied by evaluating Shapley additive explanation contribution plots. Of the 130,979 patients studied, 36,042 (27.5%) were female, and the mean (SD) age was 58.2 (13.7) years. The Gradient Boosting Machine model performed the best, achieving 81.8% accuracy and 97.7% precision for 5-year prediction. Moreover, 10-year predictions demonstrated 80.0% accuracy and 94.0% precision.
The number of prior tumors, patient age, the site of cancer recurrence, and tumor histology were the most significant predictors. The implementation of our novel SEER framework enabled the successful identification of patients with OTSCC recurrence, with which highly accurate and sensitive prediction models were generated. Thus, we demonstrate our framework’s potential for application in various cancers to build generalizable screening tools to predict tumor recurrence. MDPI 2023-05-16 /pmc/articles/PMC10216090/ /pubmed/37345106 http://dx.doi.org/10.3390/cancers15102769 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Fatapour, Yasaman
Abiri, Arash
Kuan, Edward C.
Brody, James P.
Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma
title Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma
title_full Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma
title_fullStr Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma
title_full_unstemmed Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma
title_short Development of a Machine Learning Model to Predict Recurrence of Oral Tongue Squamous Cell Carcinoma
title_sort development of a machine learning model to predict recurrence of oral tongue squamous cell carcinoma
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10216090/
https://www.ncbi.nlm.nih.gov/pubmed/37345106
http://dx.doi.org/10.3390/cancers15102769
work_keys_str_mv AT fatapouryasaman developmentofamachinelearningmodeltopredictrecurrenceoforaltonguesquamouscellcarcinoma
AT abiriarash developmentofamachinelearningmodeltopredictrecurrenceoforaltonguesquamouscellcarcinoma
AT kuanedwardc developmentofamachinelearningmodeltopredictrecurrenceoforaltonguesquamouscellcarcinoma
AT brodyjamesp developmentofamachinelearningmodeltopredictrecurrenceoforaltonguesquamouscellcarcinoma