SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images

Deep learning has revolutionized digital pathology, allowing for automatic analysis of hematoxylin and eosin (H&E) stained whole slide images (WSIs) for diverse tasks. In such analyses, WSIs are typically broken into smaller images called tiles, and a neural network backbone encodes each tile in...

Bibliographic Details
Main Authors: Mukashyaka, Patience; Sheridan, Todd B.; Foroughi pour, Ali; Chuang, Jeffrey H.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10418159/
https://www.ncbi.nlm.nih.gov/pubmed/37577691
http://dx.doi.org/10.1101/2023.08.01.551468
_version_ 1785088207300329472
author Mukashyaka, Patience
Sheridan, Todd B.
Foroughi pour, Ali
Chuang, Jeffrey H.
author_facet Mukashyaka, Patience
Sheridan, Todd B.
Foroughi pour, Ali
Chuang, Jeffrey H.
author_sort Mukashyaka, Patience
collection PubMed
description Deep learning has revolutionized digital pathology, allowing for automatic analysis of hematoxylin and eosin (H&E) stained whole slide images (WSIs) for diverse tasks. In such analyses, WSIs are typically broken into smaller images called tiles, and a neural network backbone encodes each tile in a feature space. Many recent works have applied attention-based deep learning models to aggregate tile-level features into a slide-level representation, which is then used for slide-level prediction tasks. However, training attention models is computationally intensive, necessitating hyperparameter optimization and specialized training procedures. Here, we propose SAMPLER, a fully statistical approach to generate efficient and informative WSI representations by encoding the empirical cumulative distribution functions (CDFs) of multiscale tile features. We demonstrate that SAMPLER-based classifiers are as accurate as or better than state-of-the-art fully deep learning attention models for classification tasks including distinction of: subtypes of breast carcinoma (BRCA: AUC=0.911±0.029); subtypes of non-small cell lung carcinoma (NSCLC: AUC=0.940±0.018); and subtypes of renal cell carcinoma (RCC: AUC=0.987±0.006). A major advantage of the SAMPLER representation is that predictive models are >100X faster than attention models. Histopathological review confirms that SAMPLER-identified high attention tiles contain tumor morphological features specific to the tumor type, while low attention tiles contain fibrous stroma, blood, or tissue folding artifacts. We further apply SAMPLER concepts to improve the design of attention-based neural networks, yielding a context-aware multi-head attention model with increased accuracy for subtype classification within BRCA and RCC (BRCA: AUC=0.921±0.027, and RCC: AUC=0.988±0.010). Finally, we provide theoretical results identifying sufficient conditions under which SAMPLER is optimal.
SAMPLER is a fast and effective approach for analyzing WSIs, with greatly improved scalability over attention methods to benefit digital pathology analysis.
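The description above says only that SAMPLER encodes the empirical CDFs of multiscale tile features. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a single scale, summarizes each feature's empirical distribution by a few linearly interpolated quantiles, and concatenates them into one slide-level vector. The function names, the quantile levels, and the toy data are all hypothetical.

```python
from typing import List, Sequence


def empirical_quantile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of an already sorted sample, 0 <= q <= 1."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def slide_representation(
    tile_features: Sequence[Sequence[float]],
    quantile_levels: Sequence[float],
) -> List[float]:
    """Concatenate per-feature quantiles over all tiles of one slide.

    tile_features: one row per tile, one column per backbone feature
    (assumed layout). The output length is
    n_features * len(quantile_levels), independent of the tile count.
    """
    n_features = len(tile_features[0])
    representation: List[float] = []
    for j in range(n_features):
        # Sorting one feature across tiles gives its empirical distribution.
        column = sorted(tile[j] for tile in tile_features)
        representation.extend(
            empirical_quantile(column, q) for q in quantile_levels
        )
    return representation


# Toy slide: three tiles, two features per tile (made-up numbers).
toy_tiles = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]]
toy_vector = slide_representation(toy_tiles, [0.0, 0.5, 1.0])
```

Because the output has a fixed length regardless of how many tiles a slide contains, a vector like this could be passed to any conventional classifier without training an attention module, which is consistent with the speed advantage the description reports.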
format Online
Article
Text
id pubmed-10418159
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-104181592023-08-12 SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images Mukashyaka, Patience Sheridan, Todd B. Foroughi pour, Ali Chuang, Jeffrey H. bioRxiv Article Deep learning has revolutionized digital pathology, allowing for automatic analysis of hematoxylin and eosin (H&E) stained whole slide images (WSIs) for diverse tasks. In such analyses, WSIs are typically broken into smaller images called tiles, and a neural network backbone encodes each tile in a feature space. Many recent works have applied attention based deep learning models to aggregate tile-level features into a slide-level representation, which is then used for slide-level prediction tasks. However, training attention models is computationally intensive, necessitating hyperparameter optimization and specialized training procedures. Here, we propose SAMPLER, a fully statistical approach to generate efficient and informative WSI representations by encoding the empirical cumulative distribution functions (CDFs) of multiscale tile features. We demonstrate that SAMPLER-based classifiers are as accurate or better than state-of-the-art fully deep learning attention models for classification tasks including distinction of: subtypes of breast carcinoma (BRCA: AUC=0.911 ± 0.029); subtypes of non-small cell lung carcinoma (NSCLC: AUC=0.940±0.018); and subtypes of renal cell carcinoma (RCC: AUC=0.987±0.006). A major advantage of the SAMPLER representation is that predictive models are >100X faster compared to attention models. Histopathological review confirms that SAMPLER-identified high attention tiles contain tumor morphological features specific to the tumor type, while low attention tiles contain fibrous stroma, blood, or tissue folding artifacts. 
We further apply SAMPLER concepts to improve the design of attention-based neural networks, yielding a context aware multi-head attention model with increased accuracy for subtype classification within BRCA and RCC (BRCA: AUC=0.921±0.027, and RCC: AUC=0.988±0.010). Finally, we provide theoretical results identifying sufficient conditions for which SAMPLER is optimal. SAMPLER is a fast and effective approach for analyzing WSIs, with greatly improved scalability over attention methods to benefit digital pathology analysis. Cold Spring Harbor Laboratory 2023-08-03 /pmc/articles/PMC10418159/ /pubmed/37577691 http://dx.doi.org/10.1101/2023.08.01.551468 Text en https://creativecommons.org/licenses/by-nc/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (https://creativecommons.org/licenses/by-nc/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator.
spellingShingle Article
Mukashyaka, Patience
Sheridan, Todd B.
Foroughi pour, Ali
Chuang, Jeffrey H.
SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_full SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_fullStr SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_full_unstemmed SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_short SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_sort sampler: empirical distribution representations for rapid analysis of whole slide tissue images
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10418159/
https://www.ncbi.nlm.nih.gov/pubmed/37577691
http://dx.doi.org/10.1101/2023.08.01.551468
work_keys_str_mv AT mukashyakapatience samplerempiricaldistributionrepresentationsforrapidanalysisofwholeslidetissueimages
AT sheridantoddb samplerempiricaldistributionrepresentationsforrapidanalysisofwholeslidetissueimages
AT foroughipourali samplerempiricaldistributionrepresentationsforrapidanalysisofwholeslidetissueimages
AT chuangjeffreyh samplerempiricaldistributionrepresentationsforrapidanalysisofwholeslidetissueimages