Playing Atari with few neurons: Improving the efficacy of reinforcement learning by decoupling feature extraction and decision making

We propose a new method for learning compact state representations and policies separately but simultaneously for policy approximation in vision-based applications such as Atari games. Approaches based on deep reinforcement learning typically map pixels directly to actions to enable end-to-end train...

Bibliographic Details
Main Authors:	Cuccu, Giuseppe; Togelius, Julian; Cudré-Mauroux, Philippe
Format:	Online Article Text
Language:	English
Published:	Springer US 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8550197/ https://www.ncbi.nlm.nih.gov/pubmed/34720684 http://dx.doi.org/10.1007/s10458-021-09497-8

author	Cuccu, Giuseppe; Togelius, Julian; Cudré-Mauroux, Philippe
collection	PubMed
description	We propose a new method for learning compact state representations and policies separately but simultaneously for policy approximation in vision-based applications such as Atari games. Approaches based on deep reinforcement learning typically map pixels directly to actions to enable end-to-end training. Internally, however, the deep neural network bears the responsibility of both extracting useful information and making decisions based on it, two objectives which can be addressed independently. Separating the image processing from the action selection allows for a better understanding of either task individually, as well as potentially finding smaller policy representations, which is inherently interesting. Our approach learns state representations using a compact encoder based on two novel algorithms: (i) Increasing Dictionary Vector Quantization builds a dictionary of state representations which grows in size over time, allowing our method to address new observations as they appear in an open-ended online-learning context; and (ii) Direct Residuals Sparse Coding encodes observations as a function of the dictionary, aiming for highest information inclusion by disregarding reconstruction error and maximizing code sparsity. As the dictionary size increases, however, the encoder produces increasingly large inputs for the neural network; this issue is addressed with a new variant of the Exponential Natural Evolution Strategies algorithm which adapts the dimensionality of its probability distribution over the course of the run. We test our system on a selection of Atari games using tiny neural networks of only 6 to 18 neurons (depending on each game’s controls). These are still capable of achieving results that are not much worse than, and occasionally superior to, the state of the art in direct policy search, which uses two orders of magnitude more neurons.
format	Online Article Text
id	pubmed-8550197
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-8550197 2021-10-29 Auton Agent Multi Agent Syst Article Springer US 2021-04-19 2021 /pmc/articles/PMC8550197/ /pubmed/34720684 http://dx.doi.org/10.1007/s10458-021-09497-8 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title	Playing Atari with few neurons: Improving the efficacy of reinforcement learning by decoupling feature extraction and decision making
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8550197/ https://www.ncbi.nlm.nih.gov/pubmed/34720684 http://dx.doi.org/10.1007/s10458-021-09497-8