Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection
SIMPLE SUMMARY: Pathology is a cornerstone in cancer diagnostics, and digital pathology and artificial intelligence-driven image analysis could potentially save time and enhance diagnostic accuracy. For clinical implementation of artificial intelligence, a major question is whether the computer mode...
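Main Authors: | Jarkman, Sofia, Karlberg, Micael, Pocevičiūtė, Milda, Bodén, Anna, Bándi, Péter, Litjens, Geert, Lundström, Claes, Treanor, Darren, van der Laak, Jeroen |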
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9659028/ https://www.ncbi.nlm.nih.gov/pubmed/36358842 http://dx.doi.org/10.3390/cancers14215424 |
_version_ | 1784830100361969664 |
---|---|
author | Jarkman, Sofia Karlberg, Micael Pocevičiūtė, Milda Bodén, Anna Bándi, Péter Litjens, Geert Lundström, Claes Treanor, Darren van der Laak, Jeroen |
author_facet | Jarkman, Sofia Karlberg, Micael Pocevičiūtė, Milda Bodén, Anna Bándi, Péter Litjens, Geert Lundström, Claes Treanor, Darren van der Laak, Jeroen |
author_sort | Jarkman, Sofia |
collection | PubMed |
description | SIMPLE SUMMARY: Pathology is a cornerstone in cancer diagnostics, and digital pathology and artificial intelligence-driven image analysis could potentially save time and enhance diagnostic accuracy. For clinical implementation of artificial intelligence, a major question is whether the computer models maintain high performance when applied to new settings. We tested the generalizability of a highly accurate deep learning model for breast cancer metastasis detection in sentinel lymph nodes from, firstly, unseen sentinel node data and, secondly, data with a small change in surgical indication, in this case lymph nodes from axillary dissections. Model performance dropped in both settings, particularly on axillary dissection nodes. Retraining of the model was needed to mitigate the performance drop. The study highlights the generalization challenge of clinical implementation of AI models, and the possibility that retraining might be necessary. ABSTRACT: Poor generalizability is a major barrier to clinical implementation of artificial intelligence in digital pathology. The aim of this study was to test the generalizability of a pretrained deep learning model to a new diagnostic setting and to a small change in surgical indication. A deep learning model for breast cancer metastases detection in sentinel lymph nodes, trained on CAMELYON multicenter data, was used as a base model, and achieved an AUC of 0.969 (95% CI 0.926–0.998) and FROC of 0.838 (95% CI 0.757–0.913) on CAMELYON16 test data. On local sentinel node data, the base model performance dropped to AUC 0.929 (95% CI 0.800–0.998) and FROC 0.744 (95% CI 0.566–0.912). On data with a change in surgical indication (axillary dissections) the base model performance indicated an even larger drop with a FROC of 0.503 (95% CI 0.201–0.911). The model was retrained with addition of local data, resulting in about a 4% increase for both AUC and FROC for sentinel nodes, and an increase of 11% in AUC and 49% in FROC for axillary nodes. Pathologist qualitative evaluation of the retrained model's output showed no missed positive slides. False positives, false negatives and one previously undetected micro-metastasis were observed. The study highlights the generalization challenge even when using a multicenter trained model, and that a small change in indication can considerably impact the model's performance. |
format | Online Article Text |
id | pubmed-9659028 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9659028 2022-11-15 Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection Jarkman, Sofia Karlberg, Micael Pocevičiūtė, Milda Bodén, Anna Bándi, Péter Litjens, Geert Lundström, Claes Treanor, Darren van der Laak, Jeroen Cancers (Basel) Article MDPI 2022-11-03 /pmc/articles/PMC9659028/ /pubmed/36358842 http://dx.doi.org/10.3390/cancers14215424 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Jarkman, Sofia Karlberg, Micael Pocevičiūtė, Milda Bodén, Anna Bándi, Péter Litjens, Geert Lundström, Claes Treanor, Darren van der Laak, Jeroen Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection |
title | Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection |
title_full | Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection |
title_fullStr | Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection |
title_full_unstemmed | Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection |
title_short | Generalization of Deep Learning in Digital Pathology: Experience in Breast Cancer Metastasis Detection |
title_sort | generalization of deep learning in digital pathology: experience in breast cancer metastasis detection |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9659028/ https://www.ncbi.nlm.nih.gov/pubmed/36358842 http://dx.doi.org/10.3390/cancers14215424 |
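
The performance drops reported in the abstract can be put on a common scale with a little arithmetic. The sketch below is only an illustration: it takes the base model's point estimates quoted in the description field and expresses each as a relative change from the CAMELYON16 test result. The helper name `relative_change` is hypothetical, and the record does not say how the authors computed their own percentages for the retrained model, so these figures should not be read as reproducing them.

```python
# Point estimates copied from the abstract in this record (base model only).
# The helper name and the choice of "relative change" are assumptions made
# for illustration; the record does not state how the authors derived their
# reported percentages.

def relative_change(baseline: float, new: float) -> float:
    """Return the change from baseline to new as a percentage of baseline."""
    return (new - baseline) / baseline * 100.0


camelyon16 = {"AUC": 0.969, "FROC": 0.838}      # CAMELYON16 test data
local_sentinel = {"AUC": 0.929, "FROC": 0.744}  # local sentinel node data
axillary_froc = 0.503                           # axillary dissections (FROC only)

for metric in ("AUC", "FROC"):
    change = relative_change(camelyon16[metric], local_sentinel[metric])
    print(f"{metric}, local sentinel nodes: {change:.1f}%")

axillary_change = relative_change(camelyon16["FROC"], axillary_froc)
print(f"FROC, axillary dissections: {axillary_change:.1f}%")
```

The abstract gives no base-model AUC for the axillary dissection data, so only FROC is compared there. The wide confidence intervals quoted in the abstract also mean these point-estimate differences carry considerable uncertainty.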