Systematic Comparison of Heatmapping Techniques in Deep Learning in the Context of Diabetic Retinopathy Lesion Detection
PURPOSE: Heatmapping techniques can support explainability of deep learning (DL) predictions in medical image analysis. However, individual techniques have been mainly applied in a descriptive way without an objective and systematic evaluation. We investigated comparative performances using diabetic...
Main authors: | Van Craenendonck, Toon; Elen, Bart; Gerrits, Nele; De Boever, Patrick |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Association for Research in Vision and Ophthalmology, 2020 |
Subjects: | Special Issue |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7774113/ https://www.ncbi.nlm.nih.gov/pubmed/33403156 http://dx.doi.org/10.1167/tvst.9.2.64 |
_version_ | 1783630194204999680 |
---|---|
author | Van Craenendonck, Toon Elen, Bart Gerrits, Nele De Boever, Patrick |
author_facet | Van Craenendonck, Toon Elen, Bart Gerrits, Nele De Boever, Patrick |
author_sort | Van Craenendonck, Toon |
collection | PubMed |
description | PURPOSE: Heatmapping techniques can support explainability of deep learning (DL) predictions in medical image analysis. However, individual techniques have been mainly applied in a descriptive way without an objective and systematic evaluation. We investigated comparative performances using diabetic retinopathy lesion detection as a benchmark task. METHODS: The Indian Diabetic Retinopathy Image Dataset (IDRiD) publicly available database contains fundus images of diabetes patients with pixel level annotations of diabetic retinopathy (DR) lesions, the ground truth for this study. Three in advance trained DL models (ResNet50, VGG16 or InceptionV3) were used for DR detection in these images. Next, explainability was visualized with each of the 10 most used heatmapping techniques. The quantitative correspondence between the output of a heatmap and the ground truth was evaluated with the Explainability Consistency Score (ECS), a metric between 0 and 1, developed for this comparative task. RESULTS: In case of the overall DR lesions detection, the ECS ranged from 0.21 to 0.51 for all model/heatmapping combinations. The highest score was for VGG16+Grad-CAM (ECS = 0.51; 95% confidence interval [CI]: [0.46; 0.55]). For individual lesions, VGG16+Grad-CAM performed best on hemorrhages and hard exudates. ResNet50+SmoothGrad performed best for soft exudates and ResNet50+Guided Backpropagation performed best for microaneurysms. CONCLUSIONS: Our empirical evaluation on the IDRiD database demonstrated that the combination DL model/heatmapping affects explainability when considering common DR lesions. Our approach found considerable disagreement between regions highlighted by heatmaps and expert annotations. TRANSLATIONAL RELEVANCE: We warrant a more systematic investigation and analysis of heatmaps for reliable explanation of image-based predictions of deep learning models. |
format | Online Article Text |
id | pubmed-7774113 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | The Association for Research in Vision and Ophthalmology |
record_format | MEDLINE/PubMed |
spelling | pubmed-7774113 2021-01-04 Systematic Comparison of Heatmapping Techniques in Deep Learning in the Context of Diabetic Retinopathy Lesion Detection Van Craenendonck, Toon Elen, Bart Gerrits, Nele De Boever, Patrick Transl Vis Sci Technol Special Issue The Association for Research in Vision and Ophthalmology 2020-12-29 /pmc/articles/PMC7774113/ /pubmed/33403156 http://dx.doi.org/10.1167/tvst.9.2.64 Text en Copyright 2020 The Authors http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. |
spellingShingle | Special Issue Van Craenendonck, Toon Elen, Bart Gerrits, Nele De Boever, Patrick Systematic Comparison of Heatmapping Techniques in Deep Learning in the Context of Diabetic Retinopathy Lesion Detection |
title | Systematic Comparison of Heatmapping Techniques in Deep Learning in the Context of Diabetic Retinopathy Lesion Detection |
title_sort | systematic comparison of heatmapping techniques in deep learning in the context of diabetic retinopathy lesion detection |
topic | Special Issue |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7774113/ https://www.ncbi.nlm.nih.gov/pubmed/33403156 http://dx.doi.org/10.1167/tvst.9.2.64 |