
Benchmarking Perturbation-Based Saliency Maps for Explaining Atari Agents

One of the most prominent methods for explaining the behavior of Deep Reinforcement Learning (DRL) agents is the generation of saliency maps that show how much each pixel contributed to the agents' decision. However, there is no work that computationally evaluates and compares the fidelity of di...


Bibliographic Details
Main authors:	Huber, Tobias; Limmer, Benedikt; André, Elisabeth
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Artificial Intelligence
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9326049/ https://www.ncbi.nlm.nih.gov/pubmed/35910188 http://dx.doi.org/10.3389/frai.2022.903875

_version_	1784757189686067200
author	Huber, Tobias; Limmer, Benedikt; André, Elisabeth
author_facet	Huber, Tobias; Limmer, Benedikt; André, Elisabeth
author_sort	Huber, Tobias
collection	PubMed
description	One of the most prominent methods for explaining the behavior of Deep Reinforcement Learning (DRL) agents is the generation of saliency maps that show how much each pixel contributed to the agents' decision. However, there is no work that computationally evaluates and compares the fidelity of different perturbation-based saliency map approaches specifically for DRL agents. It is particularly challenging to computationally evaluate saliency maps for DRL agents since their decisions are part of an overarching policy, which includes long-term decision making. For instance, the output neurons of value-based DRL algorithms encode both the value of the current state and the expected future reward after doing each action in this state. This ambiguity should be considered when evaluating saliency maps for such agents. In this paper, we compare five popular perturbation-based approaches to create saliency maps for DRL agents trained on four different Atari 2600 games. The approaches are compared using two computational metrics: dependence on the learned parameters of the underlying deep Q-network of the agents (sanity checks) and fidelity to the agents' reasoning (input degradation). During the sanity checks, we found that a popular noise-based saliency map approach for DRL agents shows little dependence on the parameters of the output layer. We demonstrate that this can be fixed by tweaking the algorithm such that it focuses on specific actions instead of the general entropy within the output values. For fidelity, we identify two main factors that influence which saliency map approach should be chosen in which situation. Particular to value-based DRL agents, we show that analyzing the agents' choice of action requires different saliency map approaches than analyzing the agents' state value estimation.
format	Online Article Text
id	pubmed-9326049
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9326049 2022-07-28 Benchmarking Perturbation-Based Saliency Maps for Explaining Atari Agents Huber, Tobias Limmer, Benedikt André, Elisabeth Front Artif Intell Artificial Intelligence Frontiers Media S.A. 2022-07-13 /pmc/articles/PMC9326049/ /pubmed/35910188 http://dx.doi.org/10.3389/frai.2022.903875 Text en Copyright © 2022 Huber, Limmer and André. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Artificial Intelligence Huber, Tobias Limmer, Benedikt André, Elisabeth Benchmarking Perturbation-Based Saliency Maps for Explaining Atari Agents
title	Benchmarking Perturbation-Based Saliency Maps for Explaining Atari Agents
title_full	Benchmarking Perturbation-Based Saliency Maps for Explaining Atari Agents
title_fullStr	Benchmarking Perturbation-Based Saliency Maps for Explaining Atari Agents
title_full_unstemmed	Benchmarking Perturbation-Based Saliency Maps for Explaining Atari Agents
title_short	Benchmarking Perturbation-Based Saliency Maps for Explaining Atari Agents
title_sort	benchmarking perturbation-based saliency maps for explaining atari agents
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9326049/ https://www.ncbi.nlm.nih.gov/pubmed/35910188 http://dx.doi.org/10.3389/frai.2022.903875
work_keys_str_mv	AT hubertobias benchmarkingperturbationbasedsaliencymapsforexplainingatariagents AT limmerbenedikt benchmarkingperturbationbasedsaliencymapsforexplainingatariagents AT andreelisabeth benchmarkingperturbationbasedsaliencymapsforexplainingatariagents
