
Robust Image Population Based Stain Color Normalization: How Many Reference Slides Are Enough?

Histopathologic evaluation of Hematoxylin & Eosin (H&E) stained slides is essential for disease diagnosis, revealing tissue morphology, structure, and cellular composition. Variations in staining protocols and equipment result in images with color nonconformity. Although pathologists compens...


Bibliographic Details
Format: Online Article Text
Language: English
Published: IEEE 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9970045/
https://www.ncbi.nlm.nih.gov/pubmed/36860498
http://dx.doi.org/10.1109/OJEMB.2023.3234443
_version_ 1784897839213576192
collection PubMed
description Histopathologic evaluation of Hematoxylin & Eosin (H&E) stained slides is essential for disease diagnosis, revealing tissue morphology, structure, and cellular composition. Variations in staining protocols and equipment result in images with color nonconformity. Although pathologists compensate for color variations, these disparities introduce inaccuracies in computational whole slide image (WSI) analysis, accentuating data domain shift and degrading generalization. Current state-of-the-art normalization methods employ a single WSI as reference, but selecting a single WSI representative of a complete WSI-cohort is infeasible and inadvertently introduces normalization bias. We seek the optimal number of slides to construct a more representative reference based on a composite/aggregate of multiple H&E density histograms and stain-vectors, obtained from a randomly selected WSI population (WSI-Cohort-Subset). We utilized 1,864 IvyGAP WSIs as a WSI-cohort, and built 200 WSI-Cohort-Subsets varying in size (from 1 to 200 WSI-pairs) using randomly selected WSIs. The WSI-pairs' mean Wasserstein Distances and WSI-Cohort-Subsets' standard deviations were calculated. The Pareto Principle defined the optimal WSI-Cohort-Subset size. The WSI-cohort underwent structure-preserving color normalization using the optimal WSI-Cohort-Subset histogram and stain-vector aggregates. Numerous normalization permutations support WSI-Cohort-Subset aggregates as representative of a WSI-cohort: the WSI-cohort converges swiftly in CIELAB color space, as a result of the law of large numbers, and the convergence follows a power law distribution. We show normalization at the optimal (Pareto Principle) WSI-Cohort-Subset size and corresponding CIELAB convergence: a) Quantitatively, using 500 WSI-cohorts; b) Quantitatively, using 8,100 WSI-regions; c) Qualitatively, using 30 cellular tumor normalization permutations. Aggregate-based stain normalization may contribute to increasing computational pathology robustness, reproducibility, and integrity.
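The subset-size analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of plain averaging to build the aggregate histogram, the use of disjoint random subsets as "pairs", and the 80% cutoff rule in `pareto_size` are all assumptions made for illustration. The record only states that mean Wasserstein Distances were computed over WSI-pairs and that the Pareto Principle defined the optimal size.

```python
import numpy as np


def wasserstein_1d(p, q, bin_width=1.0):
    """W1 distance between two histograms on the same equally spaced bins."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    # In 1D, W1 equals the area between the two cumulative distributions.
    return float(np.sum(np.abs(np.cumsum(p) - np.cumsum(q))) * bin_width)


def aggregate(histograms):
    """Average a stack of histograms into one reference (assumed aggregation)."""
    h = np.asarray(histograms, dtype=float)
    h = h / h.sum(axis=1, keepdims=True)
    return h.mean(axis=0)


def mean_pair_distance(cohort, size, n_pairs, rng):
    """Mean W1 distance between aggregates of two disjoint random subsets."""
    dists = []
    for _ in range(n_pairs):
        idx = rng.choice(len(cohort), size=2 * size, replace=False)
        a = aggregate(cohort[idx[:size]])
        b = aggregate(cohort[idx[size:]])
        dists.append(wasserstein_1d(a, b))
    return float(np.mean(dists))


def pareto_size(sizes, distances, fraction=0.8):
    """Smallest size reaching `fraction` of the total drop (hypothetical rule)."""
    d = np.asarray(distances, dtype=float)
    target = d[0] - fraction * (d[0] - d.min())
    for s, v in zip(sizes, d):
        if v <= target:
            return s
    return sizes[-1]
```

With a cohort stored as a 2D array of per-slide density histograms, calling `mean_pair_distance` over increasing sizes and passing the resulting curve to `pareto_size` gives a candidate subset size. Under the law of large numbers the curve is expected to flatten quickly, which is the behavior the abstract reports.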
format Online
Article
Text
id pubmed-9970045
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher IEEE
record_format MEDLINE/PubMed
spelling pubmed-9970045 2023-02-28 Robust Image Population Based Stain Color Normalization: How Many Reference Slides Are Enough? IEEE Open J Eng Med Biol Article IEEE 2023-01-05 /pmc/articles/PMC9970045/ /pubmed/36860498 http://dx.doi.org/10.1109/OJEMB.2023.3234443 Text en This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
title Robust Image Population Based Stain Color Normalization: How Many Reference Slides Are Enough?
topic Article
work_keys_str_mv AT robustimagepopulationbasedstaincolornormalizationhowmanyreferenceslidesareenough