
SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images

Deep learning has revolutionized digital pathology, allowing for automatic analysis of hematoxylin and eosin (H&E) stained whole slide images (WSIs) for diverse tasks. In such analyses, WSIs are typically broken into smaller images called tiles, and a neural network backbone encodes each tile in...


Bibliographic Details
Main Authors:	Mukashyaka, Patience; Sheridan, Todd B.; Foroughi pour, Ali; Chuang, Jeffrey H.
Format:	Online Article Text
Language:	English
Published:	Cold Spring Harbor Laboratory 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10418159/ https://www.ncbi.nlm.nih.gov/pubmed/37577691 http://dx.doi.org/10.1101/2023.08.01.551468

_version_	1785088207300329472
author	Mukashyaka, Patience Sheridan, Todd B. Foroughi pour, Ali Chuang, Jeffrey H.
author_facet	Mukashyaka, Patience Sheridan, Todd B. Foroughi pour, Ali Chuang, Jeffrey H.
author_sort	Mukashyaka, Patience
collection	PubMed
description	Deep learning has revolutionized digital pathology, allowing for automatic analysis of hematoxylin and eosin (H&E) stained whole slide images (WSIs) for diverse tasks. In such analyses, WSIs are typically broken into smaller images called tiles, and a neural network backbone encodes each tile in a feature space. Many recent works have applied attention-based deep learning models to aggregate tile-level features into a slide-level representation, which is then used for slide-level prediction tasks. However, training attention models is computationally intensive, necessitating hyperparameter optimization and specialized training procedures. Here, we propose SAMPLER, a fully statistical approach to generate efficient and informative WSI representations by encoding the empirical cumulative distribution functions (CDFs) of multiscale tile features. We demonstrate that SAMPLER-based classifiers are as accurate as or better than state-of-the-art fully deep learning attention models for classification tasks including distinction of: subtypes of breast carcinoma (BRCA: AUC=0.911±0.029); subtypes of non-small cell lung carcinoma (NSCLC: AUC=0.940±0.018); and subtypes of renal cell carcinoma (RCC: AUC=0.987±0.006). A major advantage of the SAMPLER representation is that predictive models are >100X faster than attention models. Histopathological review confirms that SAMPLER-identified high attention tiles contain tumor morphological features specific to the tumor type, while low attention tiles contain fibrous stroma, blood, or tissue folding artifacts. We further apply SAMPLER concepts to improve the design of attention-based neural networks, yielding a context-aware multi-head attention model with increased accuracy for subtype classification within BRCA and RCC (BRCA: AUC=0.921±0.027, and RCC: AUC=0.988±0.010). Finally, we provide theoretical results identifying sufficient conditions under which SAMPLER is optimal. SAMPLER is a fast and effective approach for analyzing WSIs, with greatly improved scalability over attention methods to benefit digital pathology analysis.
format	Online Article Text
id	pubmed-10418159
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Cold Spring Harbor Laboratory
record_format	MEDLINE/PubMed
spelling	pubmed-10418159 2023-08-12 SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images Mukashyaka, Patience Sheridan, Todd B. Foroughi pour, Ali Chuang, Jeffrey H. bioRxiv Article Deep learning has revolutionized digital pathology, allowing for automatic analysis of hematoxylin and eosin (H&E) stained whole slide images (WSIs) for diverse tasks. In such analyses, WSIs are typically broken into smaller images called tiles, and a neural network backbone encodes each tile in a feature space. Many recent works have applied attention-based deep learning models to aggregate tile-level features into a slide-level representation, which is then used for slide-level prediction tasks. However, training attention models is computationally intensive, necessitating hyperparameter optimization and specialized training procedures. Here, we propose SAMPLER, a fully statistical approach to generate efficient and informative WSI representations by encoding the empirical cumulative distribution functions (CDFs) of multiscale tile features. We demonstrate that SAMPLER-based classifiers are as accurate as or better than state-of-the-art fully deep learning attention models for classification tasks including distinction of: subtypes of breast carcinoma (BRCA: AUC=0.911±0.029); subtypes of non-small cell lung carcinoma (NSCLC: AUC=0.940±0.018); and subtypes of renal cell carcinoma (RCC: AUC=0.987±0.006). A major advantage of the SAMPLER representation is that predictive models are >100X faster than attention models. Histopathological review confirms that SAMPLER-identified high attention tiles contain tumor morphological features specific to the tumor type, while low attention tiles contain fibrous stroma, blood, or tissue folding artifacts. We further apply SAMPLER concepts to improve the design of attention-based neural networks, yielding a context-aware multi-head attention model with increased accuracy for subtype classification within BRCA and RCC (BRCA: AUC=0.921±0.027, and RCC: AUC=0.988±0.010). Finally, we provide theoretical results identifying sufficient conditions under which SAMPLER is optimal. SAMPLER is a fast and effective approach for analyzing WSIs, with greatly improved scalability over attention methods to benefit digital pathology analysis. Cold Spring Harbor Laboratory 2023-08-03 /pmc/articles/PMC10418159/ /pubmed/37577691 http://dx.doi.org/10.1101/2023.08.01.551468 Text en https://creativecommons.org/licenses/by-nc/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (https://creativecommons.org/licenses/by-nc/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator.
spellingShingle	Article Mukashyaka, Patience Sheridan, Todd B. Foroughi pour, Ali Chuang, Jeffrey H. SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title	SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_full	SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_fullStr	SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_full_unstemmed	SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_short	SAMPLER: Empirical distribution representations for rapid analysis of whole slide tissue images
title_sort	sampler: empirical distribution representations for rapid analysis of whole slide tissue images
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10418159/ https://www.ncbi.nlm.nih.gov/pubmed/37577691 http://dx.doi.org/10.1101/2023.08.01.551468
work_keys_str_mv	AT mukashyakapatience samplerempiricaldistributionrepresentationsforrapidanalysisofwholeslidetissueimages AT sheridantoddb samplerempiricaldistributionrepresentationsforrapidanalysisofwholeslidetissueimages AT foroughipourali samplerempiricaldistributionrepresentationsforrapidanalysisofwholeslidetissueimages AT chuangjeffreyh samplerempiricaldistributionrepresentationsforrapidanalysisofwholeslidetissueimages
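
Illustrative note: the abstract describes representing a slide by the empirical CDFs of its tile features. A minimal sketch of that general idea follows. It is not the authors' implementation: the function name `empirical_cdf_representation`, the choice to encode each CDF by its quantiles at evenly spaced probabilities, and the parameter `n_quantiles` are all assumptions made for illustration, and the multiscale features and downstream classifier mentioned in the abstract are not modeled here.

```python
import numpy as np


def empirical_cdf_representation(tile_features, n_quantiles=10):
    """Summarize a slide's tiles as per-feature quantiles (hypothetical helper).

    tile_features: array of shape (n_tiles, n_features), one row per tile.
    Returns a flat vector of length n_features * n_quantiles, grouped by feature.
    """
    tile_features = np.asarray(tile_features, dtype=float)
    if tile_features.ndim != 2 or tile_features.shape[0] == 0:
        raise ValueError("expected a non-empty (n_tiles, n_features) array")

    # Evenly spaced probabilities strictly inside (0, 1); an assumed choice.
    probs = (np.arange(n_quantiles) + 0.5) / n_quantiles

    # Quantiles of each feature across tiles: shape (n_quantiles, n_features).
    quantiles = np.quantile(tile_features, probs, axis=0)

    # Transpose so each feature's quantiles are contiguous, then flatten.
    return quantiles.T.reshape(-1)


# Synthetic stand-ins for backbone features of two slides with different tile counts.
rng = np.random.default_rng(0)
slide_a = rng.normal(size=(200, 4))
slide_b = rng.normal(size=(57, 4))

rep_a = empirical_cdf_representation(slide_a)
rep_b = empirical_cdf_representation(slide_b)
```

Under these assumptions, slides with different numbers of tiles map to vectors of the same length, and the result does not depend on tile order, so the vectors could be passed to any standard classifier without training an aggregation model.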
