Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection

SIMPLE SUMMARY: Pathology is a cornerstone in cancer diagnostics, and digital pathology and artificial intelligence-driven image analysis could potentially save time and enhance diagnostic accuracy. For clinical implementation of artificial intelligence, a major question is whether the computer mode...

Bibliographic Details
Main Authors: Jarkman, Sofia, Karlberg, Micael, Pocevičiūtė, Milda, Bodén, Anna, Bándi, Péter, Litjens, Geert, Lundström, Claes, Treanor, Darren, van der Laak, Jeroen
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9659028/
https://www.ncbi.nlm.nih.gov/pubmed/36358842
http://dx.doi.org/10.3390/cancers14215424
_version_ 1784830100361969664
author Jarkman, Sofia
Karlberg, Micael
Pocevičiūtė, Milda
Bodén, Anna
Bándi, Péter
Litjens, Geert
Lundström, Claes
Treanor, Darren
van der Laak, Jeroen
collection PubMed
description SIMPLE SUMMARY: Pathology is a cornerstone in cancer diagnostics, and digital pathology and artificial intelligence-driven image analysis could potentially save time and enhance diagnostic accuracy. For clinical implementation of artificial intelligence, a major question is whether the computer models maintain high performance when applied to new settings. We tested the generalizability of a highly accurate deep learning model for breast cancer metastasis detection in sentinel lymph nodes from, firstly, unseen sentinel node data and, secondly, data with a small change in surgical indication, in this case lymph nodes from axillary dissections. Model performance dropped in both settings, particularly on axillary dissection nodes. Retraining of the model was needed to mitigate the performance drop. The study highlights the generalization challenge of clinical implementation of AI models, and the possibility that retraining might be necessary.
ABSTRACT: Poor generalizability is a major barrier to clinical implementation of artificial intelligence in digital pathology. The aim of this study was to test the generalizability of a pretrained deep learning model to a new diagnostic setting and to a small change in surgical indication. A deep learning model for breast cancer metastases detection in sentinel lymph nodes, trained on CAMELYON multicenter data, was used as a base model, and achieved an AUC of 0.969 (95% CI 0.926–0.998) and FROC of 0.838 (95% CI 0.757–0.913) on CAMELYON16 test data. On local sentinel node data, the base model performance dropped to AUC 0.929 (95% CI 0.800–0.998) and FROC 0.744 (95% CI 0.566–0.912). On data with a change in surgical indication (axillary dissections) the base model performance indicated an even larger drop with a FROC of 0.503 (95% CI 0.201–0.911). The model was retrained with addition of local data, resulting in about a 4% increase for both AUC and FROC for sentinel nodes, and an increase of 11% in AUC and 49% in FROC for axillary nodes. Pathologist qualitative evaluation of the retrained model's output showed no missed positive slides. False positives, false negatives and one previously undetected micro-metastasis were observed. The study highlights the generalization challenge even when using a multicenter trained model, and that a small change in indication can considerably impact the model's performance.
format Online
Article
Text
id pubmed-9659028
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9659028 2022-11-15 Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection Jarkman, Sofia Karlberg, Micael Pocevičiūtė, Milda Bodén, Anna Bándi, Péter Litjens, Geert Lundström, Claes Treanor, Darren van der Laak, Jeroen Cancers (Basel) Article MDPI 2022-11-03 /pmc/articles/PMC9659028/ /pubmed/36358842 http://dx.doi.org/10.3390/cancers14215424 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9659028/
https://www.ncbi.nlm.nih.gov/pubmed/36358842
http://dx.doi.org/10.3390/cancers14215424