Selective particle attention: Rapidly and flexibly selecting features for deep reinforcement learning
Deep Reinforcement Learning (RL) is often criticised for being data inefficient and inflexible to changes in task structure. Part of the reason for these issues is that Deep RL typically learns end-to-end using backpropagation, which results in task-specific representations. One approach for circumventing these problems is to apply Deep RL to existing representations that have been learned in a more task-agnostic fashion. However, this only partially solves the problem as the Deep RL algorithm learns a function of all pre-existing representations and is therefore still susceptible to data inefficiency and a lack of flexibility. Biological agents appear to solve this problem by forming internal representations over many tasks and only selecting a subset of these features for decision-making based on the task at hand; a process commonly referred to as selective attention. We take inspiration from selective attention in biological agents and propose a novel algorithm called Selective Particle Attention (SPA), which selects subsets of existing representations for Deep RL. Crucially, these subsets are not learned through backpropagation, which is slow and prone to overfitting, but instead via a particle filter that rapidly and flexibly identifies key subsets of features using only reward feedback. We evaluate SPA on two tasks that involve raw pixel input and dynamic changes to the task structure, and show that it greatly increases the efficiency and flexibility of downstream Deep RL algorithms.
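The mechanism described in the abstract — a population of "particles", each hypothesising which pre-learned features matter, reweighted using reward feedback alone and resampled when the weights degenerate — can be illustrated with a short, self-contained sketch. Everything concrete below (binary masks as particles, the toy per-particle value estimate, the Gaussian likelihood, the effective-sample-size resampling trigger, and names such as `N_FEATURES` and `update`) is an assumption made for this illustration, not the paper's actual update rule.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 16   # size of the pre-learned feature vector (illustrative)
N_PARTICLES = 64  # number of particles (illustrative)

# Each particle is a binary mask hypothesising which features matter.
particles = rng.integers(0, 2, size=(N_PARTICLES, N_FEATURES))
weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)


def attention_mask():
    """Soft attention over features: the weighted vote of all particles."""
    return weights @ particles  # each entry lies in [0, 1]


def update(features, reward, sigma=0.5):
    """Reweight particles by how well each mask 'explains' the observed
    reward, then resample when the weights degenerate. The per-particle
    value estimate and Gaussian likelihood are assumptions of this sketch."""
    global particles, weights
    masked = particles * features  # (P, N): features each particle attends to
    values = masked.sum(axis=1) / np.maximum(particles.sum(axis=1), 1)
    likelihood = np.exp(-0.5 * ((reward - values) / sigma) ** 2)
    weights = weights * likelihood
    weights /= weights.sum()
    # Resample if the effective sample size has collapsed.
    if 1.0 / np.sum(weights ** 2) < N_PARTICLES / 2:
        idx = rng.choice(N_PARTICLES, size=N_PARTICLES, p=weights)
        particles = particles[idx]
        # Flip a few bits so new feature subsets keep being explored.
        flips = rng.random(particles.shape) < 0.05
        particles = np.where(flips, 1 - particles, particles)
        weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)


# Usage: gate the encoder's features before the downstream RL agent sees them.
features = rng.random(N_FEATURES)   # stand-in for pre-learned representations
update(features, reward=1.0)
attended = attention_mask() * features  # input to the policy/value network
```

In the paper's setting the gated features would come from an encoder over raw pixel input and feed a Deep RL learner; here random vectors stand in so the sketch runs on its own.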
| Main Authors | Blakeman, Sam; Mareschal, Denis |
|---|---|
| Format | Online Article Text |
| Language | English |
| Published | Pergamon Press, 2022 |
| Subjects | Article |
| Online Access | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9037388/ https://www.ncbi.nlm.nih.gov/pubmed/35358888 http://dx.doi.org/10.1016/j.neunet.2022.03.015 |

| id | pubmed-9037388 |
|---|---|
| collection | PubMed |
| institution | National Center for Biotechnology Information |
| record_format | MEDLINE/PubMed |
| publishDate | 2022-06 |
| journal | Neural Netw |
| license | © 2022 The Authors. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/). |