Loss-Based Attention for Interpreting Image-Level Prediction of Convolutional Neural Networks
Although deep neural networks have achieved great success on numerous large-scale tasks, poor interpretability is still a notorious obstacle for practical applications. In this paper, we propose a novel and general attention mechanism, loss-based attention, upon which we modify deep neural networks...
Main Authors: | Shi, Xiaoshuang; Xing, Fuyong; Xu, Kaidi; Chen, Pingjun; Liang, Yun; Lu, Zhiyong; Guo, Zhenhua |
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9531187/ https://www.ncbi.nlm.nih.gov/pubmed/33382655 http://dx.doi.org/10.1109/TIP.2020.3046875 |
_version_ | 1784801848499109888 |
author | Shi, Xiaoshuang Xing, Fuyong Xu, Kaidi Chen, Pingjun Liang, Yun Lu, Zhiyong Guo, Zhenhua |
author_facet | Shi, Xiaoshuang Xing, Fuyong Xu, Kaidi Chen, Pingjun Liang, Yun Lu, Zhiyong Guo, Zhenhua |
author_sort | Shi, Xiaoshuang |
collection | PubMed |
description | Although deep neural networks have achieved great success on numerous large-scale tasks, poor interpretability is still a notorious obstacle for practical applications. In this paper, we propose a novel and general attention mechanism, loss-based attention, upon which we modify deep neural networks to mine significant image patches for explaining which parts determine the image decision-making. This is inspired by the fact that some patches contain significant objects or their parts for image-level decision. Unlike previous attention mechanisms that adopt different layers and parameters to learn weights and image prediction, the proposed loss-based attention mechanism mines significant patches by utilizing the same parameters to learn patch weights and logits (class vectors), and image prediction simultaneously, so as to connect the attention mechanism with the loss function for boosting the patch precision and recall. Additionally, different from previous popular networks that utilize max-pooling or stride operations in convolutional layers without considering the spatial relationship of features, the modified deep architectures first remove them to preserve the spatial relationship of image patches and greatly reduce their dependencies, and then add two convolutional or capsule layers to extract their features. With the learned patch weights, the image-level decision of the modified deep architectures is the weighted sum on patches. Extensive experiments on large-scale benchmark databases demonstrate that the proposed architectures can obtain better or competitive performance to state-of-the-art baseline networks with better interpretability. |
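The description above states that the same parameters are used to learn both the per-patch class logits and the patch attention weights, and that the image-level decision is the attention-weighted sum over patches. The following is a minimal NumPy sketch of that idea, not the authors' implementation: all dimensions are made up, and scoring each patch by its maximum class logit is only one plausible reading of how the shared logits could drive the attention weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: n patches, d-dim patch features, c classes.
n_patches, feat_dim, n_classes = 16, 32, 10
X = rng.standard_normal((n_patches, feat_dim))   # patch features
W = rng.standard_normal((feat_dim, n_classes))   # shared classifier weights

# The same parameters W produce the per-patch class logits...
patch_logits = X @ W                              # shape (n_patches, n_classes)

# ...and the patch attention weights: here each patch is scored by its
# maximum class logit, and the scores are normalized across patches.
attn = softmax(patch_logits.max(axis=1), axis=0)  # shape (n_patches,), sums to 1

# Image-level decision: attention-weighted sum of the patch logits.
image_logits = attn @ patch_logits                # shape (n_classes,)
image_probs = softmax(image_logits)
```

Because the attention weights are computed from the same logits that enter the classification loss, gradients from the loss would shape both the patch scores and the image prediction jointly, which is the coupling the abstract describes.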
format | Online Article Text |
id | pubmed-9531187 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-95311872022-10-04 Loss-Based Attention for Interpreting Image-Level Prediction of Convolutional Neural Networks Shi, Xiaoshuang Xing, Fuyong Xu, Kaidi Chen, Pingjun Liang, Yun Lu, Zhiyong Guo, Zhenhua IEEE Trans Image Process Article Although deep neural networks have achieved great success on numerous large-scale tasks, poor interpretability is still a notorious obstacle for practical applications. In this paper, we propose a novel and general attention mechanism, loss-based attention, upon which we modify deep neural networks to mine significant image patches for explaining which parts determine the image decision-making. This is inspired by the fact that some patches contain significant objects or their parts for image-level decision. Unlike previous attention mechanisms that adopt different layers and parameters to learn weights and image prediction, the proposed loss-based attention mechanism mines significant patches by utilizing the same parameters to learn patch weights and logits (class vectors), and image prediction simultaneously, so as to connect the attention mechanism with the loss function for boosting the patch precision and recall. Additionally, different from previous popular networks that utilize max-pooling or stride operations in convolutional layers without considering the spatial relationship of features, the modified deep architectures first remove them to preserve the spatial relationship of image patches and greatly reduce their dependencies, and then add two convolutional or capsule layers to extract their features. With the learned patch weights, the image-level decision of the modified deep architectures is the weighted sum on patches. Extensive experiments on large-scale benchmark databases demonstrate that the proposed architectures can obtain better or competitive performance to state-of-the-art baseline networks with better interpretability. 
2021 2021-01-11 /pmc/articles/PMC9531187/ /pubmed/33382655 http://dx.doi.org/10.1109/TIP.2020.3046875 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Shi, Xiaoshuang Xing, Fuyong Xu, Kaidi Chen, Pingjun Liang, Yun Lu, Zhiyong Guo, Zhenhua Loss-Based Attention for Interpreting Image-Level Prediction of Convolutional Neural Networks |
title | Loss-Based Attention for Interpreting Image-Level Prediction of Convolutional Neural Networks |
title_full | Loss-Based Attention for Interpreting Image-Level Prediction of Convolutional Neural Networks |
title_fullStr | Loss-Based Attention for Interpreting Image-Level Prediction of Convolutional Neural Networks |
title_full_unstemmed | Loss-Based Attention for Interpreting Image-Level Prediction of Convolutional Neural Networks |
title_short | Loss-Based Attention for Interpreting Image-Level Prediction of Convolutional Neural Networks |
title_sort | loss-based attention for interpreting image-level prediction of convolutional neural networks |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9531187/ https://www.ncbi.nlm.nih.gov/pubmed/33382655 http://dx.doi.org/10.1109/TIP.2020.3046875 |
work_keys_str_mv | AT shixiaoshuang lossbasedattentionforinterpretingimagelevelpredictionofconvolutionalneuralnetworks AT xingfuyong lossbasedattentionforinterpretingimagelevelpredictionofconvolutionalneuralnetworks AT xukaidi lossbasedattentionforinterpretingimagelevelpredictionofconvolutionalneuralnetworks AT chenpingjun lossbasedattentionforinterpretingimagelevelpredictionofconvolutionalneuralnetworks AT liangyun lossbasedattentionforinterpretingimagelevelpredictionofconvolutionalneuralnetworks AT luzhiyong lossbasedattentionforinterpretingimagelevelpredictionofconvolutionalneuralnetworks AT guozhenhua lossbasedattentionforinterpretingimagelevelpredictionofconvolutionalneuralnetworks |