
Model based planners reflect on their model-free propensities

Dual-reinforcement learning theory proposes behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB), system. This architecture raises a question as to the degree to which, when devising a plan, a MB controller takes accou...


Bibliographic Details
Main Authors:	Moran, Rani; Keramati, Mehdi; Dolan, Raymond J.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2021
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7817042/ https://www.ncbi.nlm.nih.gov/pubmed/33411724 http://dx.doi.org/10.1371/journal.pcbi.1008552

_version_	1783638562088943616
author	Moran, Rani Keramati, Mehdi Dolan, Raymond J.
author_facet	Moran, Rani Keramati, Mehdi Dolan, Raymond J.
author_sort	Moran, Rani
collection	PubMed
description	Dual-reinforcement learning theory proposes behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB) system. This architecture raises a question as to the degree to which, when devising a plan, a MB controller takes account of influences from its MF counterpart. We present evidence that such a sophisticated self-reflective MB planner incorporates an anticipation of the influences its own MF-proclivities exert on the execution of its planned future actions. Using a novel bandit task, wherein subjects were periodically allowed to design their environment, we show that reward-assignments were constructed in a manner consistent with a MB system taking account of its MF propensities. Thus, in the task participants assigned higher rewards to bandits that were momentarily associated with stronger MF tendencies. Our findings have implications for a range of decision-making domains that include drug abuse, pre-commitment, and the tension between short and long-term decision horizons in economics.
format	Online Article Text
id	pubmed-7817042
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-78170422021-01-28 Model based planners reflect on their model-free propensities Moran, Rani Keramati, Mehdi Dolan, Raymond J. PLoS Comput Biol Research Article Dual-reinforcement learning theory proposes behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB), system. This architecture raises a question as to the degree to which, when devising a plan, a MB controller takes account of influences from its MF counterpart. We present evidence that such a sophisticated self-reflective MB planner incorporates an anticipation of the influences its own MF-proclivities exerts on the execution of its planned future actions. Using a novel bandit task, wherein subjects were periodically allowed to design their environment, we show that reward-assignments were constructed in a manner consistent with a MB system taking account of its MF propensities. Thus, in the task participants assigned higher rewards to bandits that were momentarily associated with stronger MF tendencies. Our findings have implications for a range of decision making domains that includes drug abuse, pre-commitment, and the tension between short and long-term decision horizons in economics. Public Library of Science 2021-01-07 /pmc/articles/PMC7817042/ /pubmed/33411724 http://dx.doi.org/10.1371/journal.pcbi.1008552 Text en © 2021 Moran et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article Moran, Rani Keramati, Mehdi Dolan, Raymond J. Model based planners reflect on their model-free propensities
title	Model based planners reflect on their model-free propensities
title_sort	model based planners reflect on their model-free propensities
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7817042/ https://www.ncbi.nlm.nih.gov/pubmed/33411724 http://dx.doi.org/10.1371/journal.pcbi.1008552
work_keys_str_mv	AT moranrani modelbasedplannersreflectontheirmodelfreepropensities AT keramatimehdi modelbasedplannersreflectontheirmodelfreepropensities AT dolanraymondj modelbasedplannersreflectontheirmodelfreepropensities
