Interpretable survival prediction for colorectal cancer using deep learning

Bibliographic Details
Main authors: Wulczyn, Ellery, Steiner, David F., Moran, Melissa, Plass, Markus, Reihs, Robert, Tan, Fraser, Flament-Auvigne, Isabelle, Brown, Trissia, Regitnig, Peter, Chen, Po-Hsuan Cameron, Hegde, Narayan, Sadhwani, Apaar, MacDonald, Robert, Ayalew, Benny, Corrado, Greg S., Peng, Lily H., Tse, Daniel, Müller, Heimo, Xu, Zhaoyang, Liu, Yun, Stumpe, Martin C., Zatloukal, Kurt, Mermel, Craig H.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8055695/
https://www.ncbi.nlm.nih.gov/pubmed/33875798
http://dx.doi.org/10.1038/s41746-021-00427-2
Description
Summary: Deriving interpretable prognostic features from deep-learning-based prognostic histopathology models remains a challenge. In this study, we developed a deep learning system (DLS) for predicting disease-specific survival for stage II and III colorectal cancer using 3652 cases (27,300 slides). When evaluated on two validation datasets containing 1239 cases (9340 slides) and 738 cases (7140 slides), respectively, the DLS achieved a 5-year disease-specific survival AUC of 0.70 (95% CI: 0.66–0.73) and 0.69 (95% CI: 0.64–0.72), and added significant predictive value to a set of nine clinicopathologic features. To interpret the DLS, we explored the ability of different human-interpretable features to explain the variance in DLS scores. We observed that clinicopathologic features such as T-category, N-category, and grade explained a small fraction of the variance in DLS scores (R² = 18% in both validation sets). Next, we generated human-interpretable histologic features by clustering embeddings from a deep-learning-based image-similarity model and showed that they explained the majority of the variance (R² of 73–80%). Furthermore, the clustering-derived feature most strongly associated with high DLS scores was also highly prognostic in isolation. With a distinct visual appearance (poorly differentiated tumor cell clusters adjacent to adipose tissue), this feature was identified by annotators with 87.0–95.5% accuracy. Our approach can be used to explain predictions from a prognostic deep learning model and uncover potentially novel prognostic features that can be reliably identified by people for future validation studies.