Model based planners reflect on their model-free propensities
Dual-reinforcement learning theory proposes behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB), system. This architecture raises a question as to the degree to which, when devising a plan, a MB controller takes accou...
Main Authors: | Moran, Rani; Keramati, Mehdi; Dolan, Raymond J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7817042/ https://www.ncbi.nlm.nih.gov/pubmed/33411724 http://dx.doi.org/10.1371/journal.pcbi.1008552 |
_version_ | 1783638562088943616 |
---|---|
author | Moran, Rani; Keramati, Mehdi; Dolan, Raymond J. |
author_facet | Moran, Rani; Keramati, Mehdi; Dolan, Raymond J. |
author_sort | Moran, Rani |
collection | PubMed |
description | Dual-reinforcement learning theory proposes behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB) system. This architecture raises a question as to the degree to which, when devising a plan, a MB controller takes account of influences from its MF counterpart. We present evidence that such a sophisticated self-reflective MB planner incorporates an anticipation of the influences its own MF-proclivities exert on the execution of its planned future actions. Using a novel bandit task, wherein subjects were periodically allowed to design their environment, we show that reward-assignments were constructed in a manner consistent with a MB system taking account of its MF propensities. Thus, in the task, participants assigned higher rewards to bandits that were momentarily associated with stronger MF tendencies. Our findings have implications for a range of decision-making domains that include drug abuse, pre-commitment, and the tension between short and long-term decision horizons in economics. |
format | Online Article Text |
id | pubmed-7817042 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-78170422021-01-28 Model based planners reflect on their model-free propensities Moran, Rani Keramati, Mehdi Dolan, Raymond J. PLoS Comput Biol Research Article Dual-reinforcement learning theory proposes behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB), system. This architecture raises a question as to the degree to which, when devising a plan, a MB controller takes account of influences from its MF counterpart. We present evidence that such a sophisticated self-reflective MB planner incorporates an anticipation of the influences its own MF-proclivities exerts on the execution of its planned future actions. Using a novel bandit task, wherein subjects were periodically allowed to design their environment, we show that reward-assignments were constructed in a manner consistent with a MB system taking account of its MF propensities. Thus, in the task participants assigned higher rewards to bandits that were momentarily associated with stronger MF tendencies. Our findings have implications for a range of decision making domains that includes drug abuse, pre-commitment, and the tension between short and long-term decision horizons in economics. Public Library of Science 2021-01-07 /pmc/articles/PMC7817042/ /pubmed/33411724 http://dx.doi.org/10.1371/journal.pcbi.1008552 Text en © 2021 Moran et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Moran, Rani Keramati, Mehdi Dolan, Raymond J. Model based planners reflect on their model-free propensities |
title | Model based planners reflect on their model-free propensities |
title_sort | model based planners reflect on their model-free propensities |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7817042/ https://www.ncbi.nlm.nih.gov/pubmed/33411724 http://dx.doi.org/10.1371/journal.pcbi.1008552 |