
SurviveAI: Long Term Survival Prediction of Cancer Patients Based on Somatic RNA-Seq Expression

Bibliographic Details
Main authors: Nayshool, Omri; Kol, Nitzan; Javaski, Elisheva; Amariglio, Ninette; Rechavi, Gideon
Format: Online Article Text
Language: English
Published: SAGE Publications 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9549197/
https://www.ncbi.nlm.nih.gov/pubmed/36225330
http://dx.doi.org/10.1177/11769351221127875
Description
Summary: MOTIVATION: Prediction of cancer outcome is a major challenge in oncology and is essential for treatment planning. Repositories such as The Cancer Genome Atlas (TCGA) contain vast amounts of data for many types of cancer. Our goal was to create reliable prediction models using TCGA data and validate them using an external dataset. RESULTS: For 16 TCGA cancer type cohorts, we optimized a Random Forest prediction model using a parameter grid search followed by a backward feature elimination loop for dimensionality reduction. Each time a feature was removed, the model was retrained and the area under the receiver operating characteristic curve (AUC-ROC) was calculated on test data. Five prediction models achieved an AUC-ROC greater than 0.80. We used Clinical Proteomic Tumor Analysis Consortium v3 (CPTAC3) data for validation. The most enriched pathways for the top models were those involved in basic functions related to tumorigenesis and organ development. Enrichment was explored for 2 prediction models of the TCGA-KIRP cohort, one composed of 42 genes (AUC-ROC = 0.86) and the other of 300 genes (AUC-ROC = 0.85). The most enriched networks for both models share only 5 network nodes: DMBT1, IL11, HOXB6, TRIB3, and PIM1. These genes play a significant role in renal cancer and might be used for prognosis prediction and as candidate therapeutic targets. AVAILABILITY AND IMPLEMENTATION: The prediction models were created and tested using the Python scikit-learn package. They are freely accessible via a friendly web interface we called surviveAI at https://tinyurl.com/surviveai.
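
The workflow summarized in RESULTS (grid search, then backward feature elimination with retraining and a test-set AUC-ROC at every step) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. The synthetic data, the parameter grid, the helper name `backward_elimination`, and the rule of dropping the feature with the lowest impurity-based importance are all assumptions made for illustration; the abstract does not state how the feature to remove was chosen.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split


def backward_elimination(X_train, y_train, X_test, y_test, params, min_features=2):
    """Hypothetical helper: drop one feature per round, retraining each time.

    Returns the column indices and test AUC-ROC of the best subset seen.
    """
    kept = list(range(X_train.shape[1]))
    best_auc, best_kept = -1.0, list(kept)
    while len(kept) >= min_features:
        model = RandomForestClassifier(random_state=0, **params)
        model.fit(X_train[:, kept], y_train)
        scores = model.predict_proba(X_test[:, kept])[:, 1]
        auc = roc_auc_score(y_test, scores)
        if auc >= best_auc:  # on ties, prefer the smaller feature set
            best_auc, best_kept = auc, list(kept)
        # Assumed removal rule: discard the least important remaining feature.
        kept.pop(int(np.argmin(model.feature_importances_)))
    return best_kept, best_auc


# Synthetic stand-in for an expression matrix (samples x genes) with binary labels.
X, y = make_classification(
    n_samples=200, n_features=12, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step 1: grid search over an illustrative (assumed) hyperparameter grid.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    scoring="roc_auc",
    cv=3,
)
grid.fit(X_train, y_train)

# Step 2: backward elimination using the tuned hyperparameters.
selected, auc = backward_elimination(
    X_train, y_train, X_test, y_test, grid.best_params_
)
print(f"kept {len(selected)} of {X.shape[1]} features, test AUC-ROC = {auc:.2f}")
```

Because the feature subset in this sketch is chosen by its test-set AUC-ROC, that score is optimistic for the selected model, which is one reason an independent dataset (CPTAC3 in the abstract) is useful for validation.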