Deep learning-based subtyping of gastric cancer histology predicts clinical outcome: a multi-institutional retrospective study

INTRODUCTION: The Laurén classification is widely used for Gastric Cancer (GC) histology subtyping. However, this classification is prone to interobserver variability and its prognostic value remains controversial. Deep Learning (DL)-based assessment of hematoxylin and eosin (H&E) stained slides...

Bibliographic Details
Main Authors: Veldhuizen, Gregory Patrick, Röcken, Christoph, Behrens, Hans-Michael, Cifci, Didem, Muti, Hannah Sophie, Yoshikawa, Takaki, Arai, Tomio, Oshima, Takashi, Tan, Patrick, Ebert, Matthias P., Pearson, Alexander T., Calderaro, Julien, Grabsch, Heike I., Kather, Jakob Nikolas
Format: Online Article Text
Language: English
Published: Springer Nature Singapore 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10361890/
https://www.ncbi.nlm.nih.gov/pubmed/37269416
http://dx.doi.org/10.1007/s10120-023-01398-x
collection PubMed
description INTRODUCTION: The Laurén classification is widely used for gastric cancer (GC) histology subtyping. However, it is prone to interobserver variability, and its prognostic value remains controversial. Deep learning (DL)-based assessment of hematoxylin and eosin (H&E)-stained slides could provide an additional layer of clinically relevant information, but has not been systematically assessed in GC.

OBJECTIVE: We aimed to train, test, and externally validate a DL-based classifier for GC histology subtyping using routine H&E-stained tissue sections from gastric adenocarcinomas, and to assess its potential prognostic utility.

METHODS: We trained a binary classifier on intestinal- and diffuse-type GC whole slide images from a subset of the TCGA cohort (N = 166) using attention-based multiple instance learning. The ground truth for these 166 GC cases was established by two expert pathologists. We deployed the model on two external GC patient cohorts, one from Europe (N = 322) and one from Japan (N = 243). We assessed classification performance using the area under the receiver operating characteristic curve (AUROC), and assessed the prognostic value of the DL-based classifier (overall, cancer-specific, and disease-free survival) with uni- and multivariate Cox proportional hazards models and Kaplan–Meier curves with log-rank tests.

RESULTS: Internal five-fold cross-validation on the TCGA GC cohort achieved a mean AUROC of 0.93 ± 0.07. External validation showed that the DL-based classifier stratified GC patients' 5-year survival better than the pathologist-based Laurén classification for all survival endpoints, despite frequently divergent model and pathologist classifications. Univariate overall survival hazard ratios (HRs) for the pathologist-based Laurén classification (diffuse type versus intestinal type) were 1.14 (95% confidence interval (CI) 0.66–1.44, p-value = 0.51) and 1.23 (95% CI 0.96–1.43, p-value = 0.09) in the Japanese and European cohorts, respectively. DL-based histology classification yielded HRs of 1.46 (95% CI 1.18–1.65, p-value < 0.005) and 1.41 (95% CI 1.20–1.57, p-value < 0.005) in the Japanese and European cohorts, respectively. In diffuse-type GC (as defined by the pathologist), classifying patients by the DL diffuse and intestinal classes provided superior survival stratification, which was statistically significant when combined with the pathologist classification in both the Japanese cohort (overall survival log-rank test p-value < 0.005; HR 1.43, 95% CI 1.05–1.66, p-value = 0.03) and the European cohort (overall survival log-rank test p-value < 0.005; HR 1.56, 95% CI 1.16–1.76, p-value < 0.005).

CONCLUSION: Our study shows that gastric adenocarcinoma subtyping, with the pathologist's Laurén classification as ground truth, can be performed using current state-of-the-art DL techniques. DL-based histology typing appears to stratify patient survival better than expert pathologist histology typing. DL-based GC histology typing has potential as an aid in subtyping. Further investigation is warranted to understand the biological mechanisms underlying the improved survival stratification despite apparently imperfect classification by the DL algorithm.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10120-023-01398-x.
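The abstract names attention-based multiple instance learning but this record gives no implementation details. As a rough illustration only, the sketch below shows one common formulation of attention pooling: each tile of a whole slide image gets a score, the scores are normalised with a softmax, and the slide-level representation is the attention-weighted average of the tile features. All function names, dimensions, random weights, and the choice of "diffuse" as the positive class are assumptions made for this example; it is not the authors' code, and no training step is shown.

```python
import math
import random


def attention_pool(tiles, V, w):
    """Return (attention weights, slide embedding) for one slide.

    tiles: list of n tile feature vectors, each of length d
    V:     list of h rows of length d (projects a tile into attention space)
    w:     vector of length h (maps the projection to a scalar score)
    All three are hypothetical inputs for illustration.
    """
    # Unnormalised attention score per tile: w . tanh(V x)
    scores = []
    for tile in tiles:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, tile))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))

    # Softmax over tiles, shifted by the maximum for numerical stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Slide embedding: attention-weighted average of the tile features
    d = len(tiles[0])
    slide = [sum(a * tile[j] for a, tile in zip(attn, tiles)) for j in range(d)]
    return attn, slide


def predict_diffuse_probability(slide, clf_w, clf_b):
    """Logistic output on the slide embedding (positive class assumed: diffuse)."""
    logit = sum(c * s for c, s in zip(clf_w, slide)) + clf_b
    return 1.0 / (1.0 + math.exp(-logit))


# Toy usage with random, untrained weights (assumed sizes)
random.seed(0)
d, h, n_tiles = 8, 4, 50
tiles = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_tiles)]
V = [[random.gauss(0, 1) for _ in range(d)] for _ in range(h)]
w = [random.gauss(0, 1) for _ in range(h)]
clf_w = [random.gauss(0, 1) for _ in range(d)]

attn, slide = attention_pool(tiles, V, w)
prob = predict_diffuse_probability(slide, clf_w, 0.0)
print(f"tiles: {len(attn)}, attention sum: {sum(attn):.6f}, p(diffuse): {prob:.3f}")
```

Because the attention weights sum to one across tiles, they can also be inspected to see which tiles contributed most to a slide-level prediction.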
id pubmed-10361890
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-10361890 2023-07-23 Gastric Cancer Original Article
Springer Nature Singapore 2023-06-03 2023 /pmc/articles/PMC10361890/ /pubmed/37269416 http://dx.doi.org/10.1007/s10120-023-01398-x Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.