Predictive representations can link model-based reinforcement learning to model-free mechanisms
Humans and animals are capable of evaluating actions by considering their long-run future rewards through a process described using model-based reinforcement learning (RL) algorithms. The mechanisms by which neural circuits perform the computations prescribed by model-based RL remain largely unknown...
Main Authors: | Russek, Evan M.; Momennejad, Ida; Botvinick, Matthew M.; Gershman, Samuel J.; Daw, Nathaniel D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2017 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5628940/ https://www.ncbi.nlm.nih.gov/pubmed/28945743 http://dx.doi.org/10.1371/journal.pcbi.1005768 |
_version_ | 1783268969775366144 |
---|---|
author | Russek, Evan M. Momennejad, Ida Botvinick, Matthew M. Gershman, Samuel J. Daw, Nathaniel D. |
author_facet | Russek, Evan M. Momennejad, Ida Botvinick, Matthew M. Gershman, Samuel J. Daw, Nathaniel D. |
author_sort | Russek, Evan M. |
collection | PubMed |
description | Humans and animals are capable of evaluating actions by considering their long-run future rewards through a process described using model-based reinforcement learning (RL) algorithms. The mechanisms by which neural circuits perform the computations prescribed by model-based RL remain largely unknown; however, multiple lines of evidence suggest that neural circuits supporting model-based behavior are structurally homologous to and overlapping with those thought to carry out model-free temporal difference (TD) learning. Here, we lay out a family of approaches by which model-based computation may be built upon a core of TD learning. The foundation of this framework is the successor representation, a predictive state representation that, when combined with TD learning of value predictions, can produce a subset of the behaviors associated with model-based learning, while requiring less decision-time computation than dynamic programming. Using simulations, we delineate the precise behavioral capabilities enabled by evaluating actions using this approach, and compare them to those demonstrated by biological organisms. We then introduce two new algorithms that build upon the successor representation while progressively mitigating its limitations. Because this framework can account for the full range of observed putatively model-based behaviors while still utilizing a core TD framework, we suggest that it represents a neurally plausible family of mechanisms for model-based evaluation. |
format | Online Article Text |
id | pubmed-5628940 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-56289402017-10-20 Predictive representations can link model-based reinforcement learning to model-free mechanisms Russek, Evan M. Momennejad, Ida Botvinick, Matthew M. Gershman, Samuel J. Daw, Nathaniel D. PLoS Comput Biol Research Article Humans and animals are capable of evaluating actions by considering their long-run future rewards through a process described using model-based reinforcement learning (RL) algorithms. The mechanisms by which neural circuits perform the computations prescribed by model-based RL remain largely unknown; however, multiple lines of evidence suggest that neural circuits supporting model-based behavior are structurally homologous to and overlapping with those thought to carry out model-free temporal difference (TD) learning. Here, we lay out a family of approaches by which model-based computation may be built upon a core of TD learning. The foundation of this framework is the successor representation, a predictive state representation that, when combined with TD learning of value predictions, can produce a subset of the behaviors associated with model-based learning, while requiring less decision-time computation than dynamic programming. Using simulations, we delineate the precise behavioral capabilities enabled by evaluating actions using this approach, and compare them to those demonstrated by biological organisms. We then introduce two new algorithms that build upon the successor representation while progressively mitigating its limitations. Because this framework can account for the full range of observed putatively model-based behaviors while still utilizing a core TD framework, we suggest that it represents a neurally plausible family of mechanisms for model-based evaluation. 
Public Library of Science 2017-09-25 /pmc/articles/PMC5628940/ /pubmed/28945743 http://dx.doi.org/10.1371/journal.pcbi.1005768 Text en © 2017 Russek et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Russek, Evan M. Momennejad, Ida Botvinick, Matthew M. Gershman, Samuel J. Daw, Nathaniel D. Predictive representations can link model-based reinforcement learning to model-free mechanisms |
title | Predictive representations can link model-based reinforcement learning to model-free mechanisms |
title_full | Predictive representations can link model-based reinforcement learning to model-free mechanisms |
title_fullStr | Predictive representations can link model-based reinforcement learning to model-free mechanisms |
title_full_unstemmed | Predictive representations can link model-based reinforcement learning to model-free mechanisms |
title_short | Predictive representations can link model-based reinforcement learning to model-free mechanisms |
title_sort | predictive representations can link model-based reinforcement learning to model-free mechanisms |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5628940/ https://www.ncbi.nlm.nih.gov/pubmed/28945743 http://dx.doi.org/10.1371/journal.pcbi.1005768 |
work_keys_str_mv | AT russekevanm predictiverepresentationscanlinkmodelbasedreinforcementlearningtomodelfreemechanisms AT momennejadida predictiverepresentationscanlinkmodelbasedreinforcementlearningtomodelfreemechanisms AT botvinickmatthewm predictiverepresentationscanlinkmodelbasedreinforcementlearningtomodelfreemechanisms AT gershmansamuelj predictiverepresentationscanlinkmodelbasedreinforcementlearningtomodelfreemechanisms AT dawnathanield predictiverepresentationscanlinkmodelbasedreinforcementlearningtomodelfreemechanisms |
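The abstract above describes evaluating actions by combining the successor representation (SR) with TD learning: a TD-style update learns a matrix of expected discounted future state occupancies, and values are then read out by weighting that matrix with one-step rewards. A minimal sketch of this idea, under assumed parameters and a toy deterministic chain environment that are illustrative only and not taken from the paper:

```python
import numpy as np

# Hypothetical sketch: learn the successor representation M with a TD-style
# update in a small deterministic chain s0 -> s1 -> ... -> s4 (terminal),
# then read out state values as V = M @ w, where w holds one-step rewards.
# All parameter choices here are illustrative assumptions.

n_states = 5
gamma = 0.9           # discount factor
alpha = 0.1           # learning rate

M = np.zeros((n_states, n_states))   # successor representation
w = np.zeros(n_states)               # one-step reward weights
w[-1] = 1.0                          # reward only at the final state

for _ in range(2000):                # repeated traversals of the chain
    s = 0
    while s < n_states - 1:
        s_next = s + 1
        # TD update on the SR: target is one-hot(s) + gamma * M[s_next]
        onehot = np.eye(n_states)[s]
        M[s] += alpha * (onehot + gamma * M[s_next] - M[s])
        s = s_next
    # terminal state predicts only its own occupancy
    M[-1] += alpha * (np.eye(n_states)[-1] - M[-1])

# Value readout: expected discounted future reward from each state.
V = M @ w
```

This decomposition is what gives the SR its intermediate character: the TD machinery of model-free learning does the heavy lifting, yet changing the reward weights `w` immediately revalues all states without relearning `M`, capturing some behaviors usually attributed to model-based planning.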