Policy search with rare significant events: Choosing the right partner to cooperate with

This paper focuses on a class of reinforcement learning problems where significant events are rare and limited to a single positive reward per episode. A typical example is that of an agent who has to choose a partner to cooperate with, while a large number of partners are simply not interested in c...

Bibliographic Details
Main authors: Ecoffet, Paul; Fontbonne, Nicolas; André, Jean-Baptiste; Bredeche, Nicolas
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9041856/
https://www.ncbi.nlm.nih.gov/pubmed/35472212
http://dx.doi.org/10.1371/journal.pone.0266841
_version_ 1784694584511561728
author Ecoffet, Paul
Fontbonne, Nicolas
André, Jean-Baptiste
Bredeche, Nicolas
author_facet Ecoffet, Paul
Fontbonne, Nicolas
André, Jean-Baptiste
Bredeche, Nicolas
author_sort Ecoffet, Paul
collection PubMed
description This paper focuses on a class of reinforcement learning problems where significant events are rare and limited to a single positive reward per episode. A typical example is that of an agent who has to choose a partner to cooperate with, while a large number of partners are simply not interested in cooperating, regardless of what the agent has to offer. We address this problem in a continuous state and action space with two different kinds of search methods: a gradient policy search method and a direct policy search method using an evolution strategy. We show that when significant events are rare, gradient information is also scarce, making it difficult for policy gradient search methods to find an optimal policy, with or without a deep neural architecture. On the other hand, we show that direct policy search methods are invariant to the rarity of significant events, which is yet another confirmation of the unique role evolutionary algorithms have to play as a reinforcement learning method.
format Online
Article
Text
id pubmed-9041856
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-90418562022-04-27 Policy search with rare significant events: Choosing the right partner to cooperate with Ecoffet, Paul Fontbonne, Nicolas André, Jean-Baptiste Bredeche, Nicolas PLoS One Research Article This paper focuses on a class of reinforcement learning problems where significant events are rare and limited to a single positive reward per episode. A typical example is that of an agent who has to choose a partner to cooperate with, while a large number of partners are simply not interested in cooperating, regardless of what the agent has to offer. We address this problem in a continuous state and action space with two different kinds of search methods: a gradient policy search method and a direct policy search method using an evolution strategy. We show that when significant events are rare, gradient information is also scarce, making it difficult for policy gradient search methods to find an optimal policy, with or without a deep neural architecture. On the other hand, we show that direct policy search methods are invariant to the rarity of significant events, which is yet another confirmation of the unique role evolutionary algorithms have to play as a reinforcement learning method. Public Library of Science 2022-04-26 /pmc/articles/PMC9041856/ /pubmed/35472212 http://dx.doi.org/10.1371/journal.pone.0266841 Text en © 2022 Ecoffet et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Ecoffet, Paul
Fontbonne, Nicolas
André, Jean-Baptiste
Bredeche, Nicolas
Policy search with rare significant events: Choosing the right partner to cooperate with
title Policy search with rare significant events: Choosing the right partner to cooperate with
title_full Policy search with rare significant events: Choosing the right partner to cooperate with
title_fullStr Policy search with rare significant events: Choosing the right partner to cooperate with
title_full_unstemmed Policy search with rare significant events: Choosing the right partner to cooperate with
title_short Policy search with rare significant events: Choosing the right partner to cooperate with
title_sort policy search with rare significant events: choosing the right partner to cooperate with
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9041856/
https://www.ncbi.nlm.nih.gov/pubmed/35472212
http://dx.doi.org/10.1371/journal.pone.0266841
work_keys_str_mv AT ecoffetpaul policysearchwithraresignificanteventschoosingtherightpartnertocooperatewith
AT fontbonnenicolas policysearchwithraresignificanteventschoosingtherightpartnertocooperatewith
AT andrejeanbaptiste policysearchwithraresignificanteventschoosingtherightpartnertocooperatewith
AT bredechenicolas policysearchwithraresignificanteventschoosingtherightpartnertocooperatewith