Model based planners reflect on their model-free propensities

Dual-reinforcement learning theory proposes that behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB) system. This architecture raises a question as to the degree to which, when devising a plan, an MB controller takes accou...

Bibliographic Details
Main authors: Moran, Rani; Keramati, Mehdi; Dolan, Raymond J.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7817042/
https://www.ncbi.nlm.nih.gov/pubmed/33411724
http://dx.doi.org/10.1371/journal.pcbi.1008552
author Moran, Rani
Keramati, Mehdi
Dolan, Raymond J.
collection PubMed
description Dual-reinforcement learning theory proposes that behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB) system. This architecture raises a question as to the degree to which, when devising a plan, an MB controller takes account of influences from its MF counterpart. We present evidence that such a sophisticated self-reflective MB planner incorporates an anticipation of the influences its own MF proclivities exert on the execution of its planned future actions. Using a novel bandit task, wherein subjects were periodically allowed to design their environment, we show that reward assignments were constructed in a manner consistent with an MB system taking account of its MF propensities. Thus, in the task, participants assigned higher rewards to bandits that were momentarily associated with stronger MF tendencies. Our findings have implications for a range of decision-making domains that includes drug abuse, pre-commitment, and the tension between short- and long-term decision horizons in economics.
format Online
Article
Text
id pubmed-7817042
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7817042 2021-01-28 Model based planners reflect on their model-free propensities Moran, Rani Keramati, Mehdi Dolan, Raymond J. PLoS Comput Biol Research Article Public Library of Science 2021-01-07 /pmc/articles/PMC7817042/ /pubmed/33411724 http://dx.doi.org/10.1371/journal.pcbi.1008552 Text en © 2021 Moran et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Model based planners reflect on their model-free propensities
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7817042/
https://www.ncbi.nlm.nih.gov/pubmed/33411724
http://dx.doi.org/10.1371/journal.pcbi.1008552