Emergence of belief-like representations through reinforcement learning

To behave adaptively, animals must learn to predict future reward, or value. To do this, animals are thought to learn reward predictions using reinforcement learning. However, in contrast to classical models, animals must learn to estimate value using only incomplete state information. Previous work suggests that animals estimate value in partially observable tasks by first forming “beliefs”—optimal Bayesian estimates of the hidden states in the task. Although this is one way to solve the problem of partial observability, it is not the only way, nor is it the most computationally scalable solution in complex, real-world environments. Here we show that a recurrent neural network (RNN) can learn to estimate value directly from observations, generating reward prediction errors that resemble those observed experimentally, without any explicit objective of estimating beliefs. We integrate statistical, functional, and dynamical systems perspectives on beliefs to show that the RNN’s learned representation encodes belief information, but only when the RNN’s capacity is sufficiently large. These results illustrate how animals can estimate value in tasks without explicitly estimating beliefs, yielding a representation useful for systems with limited capacity.
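
The “beliefs” referred to in the abstract are the Bayesian posterior over the task’s hidden states, updated recursively from observations. The sketch below is illustrative only, not the paper’s task or code: it assumes a toy two-state hidden Markov model with made-up transition and observation probabilities, and shows the standard filtering recursion that defines an optimal belief.

```python
import numpy as np

# Toy 2-state HMM (values are illustrative assumptions, not from the paper).
T = np.array([[0.9, 0.1],    # T[i, j] = P(next state = j | current state = i)
              [0.2, 0.8]])
O = np.array([[0.8, 0.2],    # O[i, k] = P(observation = k | state = i)
              [0.3, 0.7]])

def update_belief(b, obs):
    """One step of Bayesian filtering: push the belief through the
    transition model, reweight by the observation likelihood, renormalize."""
    b_pred = b @ T               # predict: P(state_t | obs_{1..t-1})
    b_post = b_pred * O[:, obs]  # condition on the new observation
    return b_post / b_post.sum()

b = np.array([0.5, 0.5])         # uniform prior over the hidden states
for obs in [0, 0, 1, 1, 1]:      # an arbitrary example observation sequence
    b = update_belief(b, obs)
print(b)                         # posterior P(state | all observations so far)
```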

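The paper’s alternative is to train a recurrent network end-to-end to predict value from raw observations, with the temporal-difference (TD) error playing the role of the reward prediction error. Below is a minimal sketch of such a training loop, assuming PyTorch; the network size, discount factor, and the cue-then-delayed-reward toy trial are illustrative assumptions, not the authors’ settings.

```python
import torch
import torch.nn as nn

class ValueRNN(nn.Module):
    def __init__(self, obs_dim: int, hidden_dim: int):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.value = nn.Linear(hidden_dim, 1)   # linear readout of value

    def forward(self, obs):                     # obs: (batch, time, obs_dim)
        h, _ = self.rnn(obs)                    # hidden states over time
        return self.value(h).squeeze(-1), h     # values: (batch, time)

gamma = 0.93                                    # discount factor (assumed)
model = ValueRNN(obs_dim=2, hidden_dim=50)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy partially observable trial: a cue at t=0, reward after a hidden delay.
obs = torch.zeros(1, 20, 2)
obs[0, 0, 0] = 1.0                              # cue observation
rew = torch.zeros(1, 20)
rew[0, 10] = 1.0                                # reward arrives at t=10

for step in range(1000):
    v, _ = model(obs)
    # Semi-gradient TD(0): the bootstrap target detaches the next-step value.
    target = rew[:, :-1] + gamma * v[:, 1:].detach()
    delta = target - v[:, :-1]                  # TD error ~ dopamine RPE
    loss = (delta ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

v, _ = model(obs)
print(v.detach().squeeze())  # after training, value should rise from cue toward reward time
```

In a sketch like this, `delta` at cue and reward times is what gets compared against dopaminergic reward prediction errors, and the hidden states `h` are what the paper probes for decodable belief information.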

Bibliographic Details
Main Authors: Hennig, Jay A.; Pinto, Sandra A. Romero; Yamaguchi, Takahiro; Linderman, Scott W.; Uchida, Naoshige; Gershman, Samuel J.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory, 2023-04-04
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10104054/
https://www.ncbi.nlm.nih.gov/pubmed/37066383
http://dx.doi.org/10.1101/2023.04.04.535512
License: This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.