
When Does Model-Based Control Pay Off?

Many accounts of decision making and reinforcement learning posit the existence of two distinct systems that control choice: a fast, automatic system and a slow, deliberative system. Recent research formalizes this distinction by mapping these systems to “model-free” and “model-based” strategies in reinforcement learning. Model-free strategies are computationally cheap because action values can simply be read from a look-up table constructed through trial and error, but they are sometimes inaccurate. In contrast, model-based strategies compute action values through planning in a causal model of the environment, which is more accurate but also more cognitively demanding. It is assumed that this trade-off between accuracy and computational demand plays an important role in the arbitration between the two strategies, but we show that the hallmark task for dissociating model-free and model-based strategies, as well as several related variants, does not embody such a trade-off. We describe five factors that reduce the effectiveness of the model-based strategy on these tasks by reducing its accuracy in estimating reward outcomes and decreasing the importance of its choices. Based on these observations, we describe a version of the task that formally and empirically exhibits an accuracy-demand trade-off between model-free and model-based strategies. Moreover, we show that human participants spontaneously increase their reliance on model-based control on this task, compared to the original paradigm. Our novel task and our computational analyses may prove important in subsequent empirical investigations of how humans balance accuracy and demand.
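The contrast drawn in the abstract can be made concrete in a few lines of code. The sketch below is an editorial illustration with an assumed toy environment (3 states, 2 actions, invented rewards), not the authors' two-step task or their computational model: model-free control reads cached action values from a look-up table and nudges them with a temporal-difference update, while model-based control recomputes values by planning (value iteration) over a learned transition model, which tracks changes in the environment more accurately but costs a full sweep per decision.

```python
# Minimal sketch of the model-free / model-based distinction described in the
# abstract. The toy environment, parameters, and function names are assumptions
# made for illustration, not the paper's paradigm.

N_STATES, N_ACTIONS = 3, 2
ALPHA, GAMMA = 0.1, 0.95  # learning rate, discount factor

# Model-free control: action values live in a look-up table updated by
# trial and error (temporal-difference learning). Cheap per decision, but the
# cached values can lag behind changes in the environment.
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def model_free_update(s, a, r, s_next):
    """One TD(0) update of the look-up table after experiencing (s, a, r, s')."""
    td_error = r + GAMMA * max(Q[s_next]) - Q[s][a]
    Q[s][a] += ALPHA * td_error

# Model-based control: action values are recomputed by planning in a causal
# model of the environment. More accurate under change, but each decision
# requires a costly sweep over the model.
T = [[1, 2], [2, 0], [0, 1]]              # T[s][a]: deterministic successor state
R = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.0]]  # R[s][a]: expected immediate reward

def model_based_values(n_sweeps=100):
    """Value iteration over the (assumed already learned) model T, R."""
    V = [0.0] * N_STATES
    for _ in range(n_sweeps):
        V = [max(R[s][a] + GAMMA * V[T[s][a]] for a in range(N_ACTIONS))
             for s in range(N_STATES)]
    return V

if __name__ == "__main__":
    model_free_update(s=0, a=1, r=1.0, s_next=2)
    print("model-free Q:", Q)
    print("model-based V:", model_based_values())
```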

Bibliographic Details
Main Authors: Kool, Wouter; Cushman, Fiery A.; Gershman, Samuel J.
Format: Online Article (Text)
Language: English
Published: Public Library of Science, 26 August 2016
Journal: PLoS Comput Biol (Research Article)
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001643/
https://www.ncbi.nlm.nih.gov/pubmed/27564094
http://dx.doi.org/10.1371/journal.pcbi.1005090
License: © 2016 Kool et al. Open access under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.