Biomarkers of Tumor Heterogeneity in Glioblastoma Multiforme Cohort of TCGA
Main authors: | , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10137245/ https://www.ncbi.nlm.nih.gov/pubmed/37190318 http://dx.doi.org/10.3390/cancers15082387 |
Summary: | SIMPLE SUMMARY: Identifying biomarkers of survival from a large-scale cohort of Glioblastoma Multiforme (GBM) pathology images is hindered by the heterogeneity of the tumor signature, compounded by age being the single most important confounder in predicting survival in GBM. The main contributions of this manuscript are to define (i) metrics for identifying subtypes of tumor heterogeneity and (ii) relevant statistics for incorporating age when evaluating competing hypotheses. As a result, the GBM cohort is stratified based on interpretable morphometric features, with or without preconditioning on published genomic subtypes. ABSTRACT: Tumor Whole Slide Images (WSIs) are often heterogeneous, which hinders the discovery of biomarkers in the presence of confounding clinical factors. In this study, we present a pipeline for identifying biomarkers from the GBM cohort of WSIs in the TCGA archive. The GBM cohort contains many technical artifacts, and the discovery of GBM biomarkers is further challenged because age is the single most important confounding factor for predicting outcomes. The proposed approach relies on interpretable features (e.g., nuclear morphometric indices), effective similarity metrics for heterogeneity analysis, and robust statistics for identifying biomarkers. The pipeline first removes artifacts (e.g., pen marks) and partitions each WSI into patches; nuclei in each patch are then segmented with an extended U-Net for subsequent quantitative representation. Because variations in fixation and staining can artificially modulate hematoxylin optical density (HOD), we extended Navab’s Lab method to normalize images and reduce the impact of batch effects. The heterogeneity of each WSI is then represented either as probability density functions (PDFs) per patient or as the composition of a dictionary predicted from the entire cohort of WSIs. 
For the PDF- and dictionary-based methods, morphometric subtypes are constructed from distances computed by optimal transport followed by linkage analysis, or by consensus clustering with Euclidean distances, respectively. For each inferred subtype, Kaplan–Meier analysis and/or the Cox regression model is used to regress the survival time. Since age is the single most important confounder for predicting survival in GBM and the proportionality assumption of the Cox model is observed to be violated, we include both age and age-squared, coupled with the likelihood ratio test and forest plots, for evaluating competing statistics. Next, the PDF- and dictionary-based methods are combined to identify biomarkers that are predictive of survival. The combined model has the advantage of integrating global (e.g., cohort-scale) and local (e.g., patient-scale) attributes of morphometric heterogeneity, coupled with robust statistics, to reveal stable biomarkers. The results indicate that, after normalization of the GBM cohort, mean HOD, eccentricity, and cellularity are predictive of survival. Finally, we also stratified the GBM cohort as a function of EGFR expression and published genomic subtypes to reveal genomic-dependent morphometric biomarkers. |
---|
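
The PDF-based subtyping step described in the summary (optimal-transport distances between per-patient distributions, followed by linkage analysis) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `morphometric_subtypes` is hypothetical, the data are synthetic, the distance is the 1-D Wasserstein distance on a single feature, and the clustering uses average linkage.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import wasserstein_distance


def morphometric_subtypes(samples_per_patient, n_subtypes):
    """Group patients by the 1-D Wasserstein distance between their feature samples.

    Hypothetical helper: each entry of samples_per_patient holds one feature
    measured over all nuclei of one patient (an empirical per-patient PDF).
    """
    n = len(samples_per_patient)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = wasserstein_distance(samples_per_patient[i], samples_per_patient[j])
            dist[i, j] = dist[j, i] = d
    # linkage() expects the condensed (upper-triangular) form of the distance matrix.
    tree = linkage(squareform(dist, checks=False), method="average")
    # Cut the dendrogram so that at most n_subtypes clusters remain.
    return fcluster(tree, t=n_subtypes, criterion="maxclust")


# Synthetic stand-in for a per-nucleus feature such as HOD: two groups of five patients.
rng = np.random.default_rng(0)
patients = [rng.normal(0.3, 0.05, 500) for _ in range(5)]
patients += [rng.normal(0.6, 0.05, 500) for _ in range(5)]
labels = morphometric_subtypes(patients, n_subtypes=2)
print(labels)
```

With real data, each inferred subtype label would then be passed to the survival analysis; the choice of feature, linkage method, and number of subtypes would need to follow the paper rather than this sketch.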