Systematic Comparison of Heatmapping Techniques in Deep Learning in the Context of Diabetic Retinopathy Lesion Detection

Bibliographic Details
Main authors: Van Craenendonck, Toon, Elen, Bart, Gerrits, Nele, De Boever, Patrick
Format: Online Article Text
Language: English
Published: The Association for Research in Vision and Ophthalmology 2020
Subjects: Special Issue
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7774113/
https://www.ncbi.nlm.nih.gov/pubmed/33403156
http://dx.doi.org/10.1167/tvst.9.2.64
collection PubMed
description PURPOSE: Heatmapping techniques can support explainability of deep learning (DL) predictions in medical image analysis. However, individual techniques have mainly been applied descriptively, without an objective and systematic evaluation. We investigated their comparative performance using diabetic retinopathy lesion detection as a benchmark task. METHODS: The publicly available Indian Diabetic Retinopathy Image Dataset (IDRiD) contains fundus images of diabetes patients with pixel-level annotations of diabetic retinopathy (DR) lesions, which served as the ground truth for this study. Three pretrained DL models (ResNet50, VGG16, and InceptionV3) were used for DR detection in these images. Next, explainability was visualized with each of the 10 most widely used heatmapping techniques. The quantitative correspondence between a heatmap and the ground truth was evaluated with the Explainability Consistency Score (ECS), a metric between 0 and 1 developed for this comparative task. RESULTS: For overall DR lesion detection, the ECS ranged from 0.21 to 0.51 across all model/heatmapping combinations. The highest score was obtained with VGG16+Grad-CAM (ECS = 0.51; 95% confidence interval [CI]: [0.46; 0.55]). For individual lesion types, VGG16+Grad-CAM performed best on hemorrhages and hard exudates, ResNet50+SmoothGrad performed best on soft exudates, and ResNet50+Guided Backpropagation performed best on microaneurysms. CONCLUSIONS: Our empirical evaluation on the IDRiD database demonstrated that the choice of DL model/heatmapping combination affects explainability when considering common DR lesions. Our approach found considerable disagreement between the regions highlighted by heatmaps and the expert annotations. TRANSLATIONAL RELEVANCE: Our findings warrant a more systematic investigation and analysis of heatmaps for reliable explanation of image-based predictions of deep learning models.
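The abstract states that heatmaps were compared quantitatively against pixel-level expert annotations, but this record does not give the definition of the ECS. The sketch below is therefore not the ECS: it is a hypothetical stand-in (the name `overlap_score` and the top-k rule are assumptions made for illustration) showing one simple way a heatmap-to-annotation agreement score between 0 and 1 could be computed. It takes the k most salient pixels of the heatmap, where k is the number of annotated lesion pixels, and reports the fraction of them that fall inside the annotation.

```python
import numpy as np


def overlap_score(heatmap: np.ndarray, lesion_mask: np.ndarray) -> float:
    """Illustrative agreement score in [0, 1]; NOT the ECS from the paper.

    Fraction of the k highest-valued heatmap pixels that lie inside the
    expert annotation, where k is the number of annotated pixels.
    """
    if heatmap.shape != lesion_mask.shape:
        raise ValueError("heatmap and lesion_mask must have the same shape")
    mask = lesion_mask.astype(bool).ravel()
    k = int(mask.sum())
    if k == 0:
        # No annotated pixels: the score is undefined, so report 0 by convention.
        return 0.0
    # Indices of the k largest heatmap values (their internal order is irrelevant).
    top_k = np.argpartition(heatmap.ravel(), -k)[-k:]
    return float(mask[top_k].mean())


# Toy 3x3 example: two of the three most salient pixels hit the annotation.
heatmap = np.array([[0.9, 0.1, 0.0],
                    [0.8, 0.2, 0.0],
                    [0.0, 0.0, 0.7]])
lesion_mask = np.array([[1, 0, 0],
                        [1, 0, 0],
                        [0, 1, 0]])
score = overlap_score(heatmap, lesion_mask)
print(round(score, 3))
```

A score of this kind rewards heatmaps that concentrate on annotated lesions and penalizes saliency placed elsewhere, which is the sort of disagreement the conclusions describe. The metric used in the study may weight regions differently.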
format Online
Article
Text
id pubmed-7774113
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher The Association for Research in Vision and Ophthalmology
record_format MEDLINE/PubMed
spelling pubmed-7774113 2021-01-04 Transl Vis Sci Technol Special Issue The Association for Research in Vision and Ophthalmology 2020-12-29 /pmc/articles/PMC7774113/ /pubmed/33403156 http://dx.doi.org/10.1167/tvst.9.2.64 Text en Copyright 2020 The Authors http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
topic Special Issue