Explaining Neural Networks Using Attentive Knowledge Distillation

Explaining the prediction of deep neural networks makes the networks more understandable and trusted, leading to their use in various mission-critical tasks. Recent progress in the learning capability of networks has primarily been due to the enormous number of model parameters, so that it is usuall...

Bibliographic Details
Main authors:	Lee, Hyeonseok, Kim, Sungchan
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7916876/ https://www.ncbi.nlm.nih.gov/pubmed/33670125 http://dx.doi.org/10.3390/s21041280

author	Lee, Hyeonseok Kim, Sungchan
collection	PubMed
description	Explaining the predictions of deep neural networks makes the networks more understandable and trusted, leading to their use in various mission-critical tasks. Recent progress in the learning capability of networks has primarily been due to the enormous number of model parameters, which usually makes their operations hard to interpret, as opposed to classical white-box models. Generating saliency maps is a popular approach to identifying the important input features used for the model prediction. Existing explanation methods typically use only the output of the last convolution layer of the model to generate a saliency map, and so lack the information included in intermediate layers. The corresponding explanations are therefore coarse and of limited accuracy. Although the accuracy can be improved by iteratively developing a saliency map, this is too time-consuming to be practical. To address these problems, we propose a novel approach to explaining the model prediction by developing an attentive surrogate network using knowledge distillation. The surrogate network aims to generate a fine-grained saliency map corresponding to the model prediction using meaningful regional information present over all network layers. Experiments demonstrated that the saliency maps are the result of spatially attentive features learned from the distillation, which makes them useful for fine-grained classification tasks. Moreover, the proposed method runs at 24.3 frames per second, which is orders of magnitude faster than the existing methods.
format	Online Article Text
id	pubmed-7916876
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7916876 2021-03-01 Explaining Neural Networks Using Attentive Knowledge Distillation Lee, Hyeonseok Kim, Sungchan Sensors (Basel) Article MDPI 2021-02-11 /pmc/articles/PMC7916876/ /pubmed/33670125 http://dx.doi.org/10.3390/s21041280 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	Explaining Neural Networks Using Attentive Knowledge Distillation
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7916876/ https://www.ncbi.nlm.nih.gov/pubmed/33670125 http://dx.doi.org/10.3390/s21041280
work_keys_str_mv	AT leehyeonseok explainingneuralnetworksusingattentiveknowledgedistillation AT kimsungchan explainingneuralnetworksusingattentiveknowledgedistillation
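
The description above mentions training a surrogate network through knowledge distillation. As a rough illustration of that general idea only, the sketch below computes a generic temperature-softened distillation loss (the KL divergence between teacher and student output distributions). It is not the authors' surrogate network or their loss; the function names `softmax` and `distillation_loss`, the temperature value, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs.

    The T^2 factor is the conventional scaling that keeps gradient
    magnitudes comparable across temperatures. Hypothetical helper,
    not taken from the article.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # made-up teacher scores for three classes
    student = [1.0, 1.0, 0.0]    # made-up student scores
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge, which is the signal a surrogate would be trained to minimise under this assumption.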

Similar Items