Exploratory State Representation Learning
Main Authors: | Merckling, Astrid; Perrin-Gilbert, Nicolas; Coninx, Alex; Doncieux, Stéphane |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2022 |
Subjects: | Robotics and AI |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8883277/ https://www.ncbi.nlm.nih.gov/pubmed/35237669 http://dx.doi.org/10.3389/frobt.2022.762051 |
_version_ | 1784659889319051264 |
---|---|
author | Merckling, Astrid; Perrin-Gilbert, Nicolas; Coninx, Alex; Doncieux, Stéphane |
author_sort | Merckling, Astrid |
collection | PubMed |
description | Not having access to compact and meaningful representations is known to significantly increase the complexity of reinforcement learning (RL). For this reason, it can be useful to perform state representation learning (SRL) before tackling RL tasks. However, obtaining a good state representation can only be done if a large diversity of transitions is observed, which can require difficult exploration, especially if the environment is initially reward-free. To solve the problems of exploration and SRL in parallel, we propose a new approach called XSRL (eXploratory State Representation Learning). On the one hand, it jointly learns compact state representations and a state transition estimator that is used to remove unexploitable information from the representations. On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a k-step learning progress bonus to form the maximization objective of a discovery policy. This results in a policy that seeks complex transitions from which the trained models can effectively learn. Our experimental results show that the approach leads to efficient exploration in challenging environments with image observations, and to state representations that significantly accelerate learning in RL tasks. (A schematic sketch of these training signals follows this record.) |
format | Online Article Text |
id | pubmed-8883277 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8883277 2022-03-01 Exploratory State Representation Learning Merckling, Astrid; Perrin-Gilbert, Nicolas; Coninx, Alex; Doncieux, Stéphane Front Robot AI Robotics and AI Frontiers Media S.A. 2022-02-14 /pmc/articles/PMC8883277/ /pubmed/35237669 http://dx.doi.org/10.3389/frobt.2022.762051 Text en Copyright © 2022 Merckling, Perrin-Gilbert, Coninx and Doncieux. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Robotics and AI; Merckling, Astrid; Perrin-Gilbert, Nicolas; Coninx, Alex; Doncieux, Stéphane; Exploratory State Representation Learning |
title | Exploratory State Representation Learning |
title_sort | exploratory state representation learning |
topic | Robotics and AI |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8883277/ https://www.ncbi.nlm.nih.gov/pubmed/35237669 http://dx.doi.org/10.3389/frobt.2022.762051 |
work_keys_str_mv | AT mercklingastrid exploratorystaterepresentationlearning AT perringilbertnicolas exploratorystaterepresentationlearning AT coninxalex exploratorystaterepresentationlearning AT doncieuxstephane exploratorystaterepresentationlearning |
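The description above specifies XSRL's two coupled training signals in prose. The sketch below is a minimal illustrative reading of them, not the authors' implementation: the network shapes, the discrete action space, the 64×64 observation size, and the exact form of the k-step learning progress bonus are all assumptions made for the sake of a runnable example.

```python
# Schematic sketch of the XSRL training signals described in the abstract.
# Assumptions (not from the paper): network shapes, a discrete action space,
# 64x64 image observations, and progress measured as the inverse-model error
# drop over the last K updates.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, N_ACTIONS, K = 32, 4, 5  # assumed hyperparameters

# Encoder: image observation o_t -> compact state s_t.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU(),
                        nn.Linear(256, STATE_DIM))
# State transition estimator: (s_t, a_t) -> predicted s_{t+1}. Training the
# encoder jointly with this model pressures the representation to keep only
# information that helps predict the next state.
transition = nn.Sequential(nn.Linear(STATE_DIM + N_ACTIONS, 256), nn.ReLU(),
                           nn.Linear(256, STATE_DIM))
# Inverse model: (s_t, s_{t+1}) -> predicted a_t, trained continuously.
inverse = nn.Sequential(nn.Linear(2 * STATE_DIM, 256), nn.ReLU(),
                        nn.Linear(256, N_ACTIONS))

def srl_losses(obs, action, next_obs):
    """Joint SRL losses for one batch of transitions (o_t, a_t, o_{t+1})."""
    s, s_next = encoder(obs), encoder(next_obs)
    a = F.one_hot(action, N_ACTIONS).float()
    transition_loss = F.mse_loss(transition(torch.cat([s, a], dim=-1)), s_next)
    inverse_error = F.cross_entropy(inverse(torch.cat([s, s_next], dim=-1)), action)
    return transition_loss, inverse_error

def discovery_reward(inverse_error_now, inverse_error_k_ago):
    """Maximization objective of the discovery policy: the inverse model's
    current prediction error plus a k-step learning progress bonus."""
    progress_bonus = inverse_error_k_ago - inverse_error_now
    return inverse_error_now + progress_bonus
```

On this reading, a policy trained with any standard RL algorithm to maximize `discovery_reward` is drawn to transitions that the inverse model still mispredicts but is actively improving on, matching the abstract's description of a policy that "seeks complex transitions from which the trained models can effectively learn."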