Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task

The recently developed ‘two-step’ behavioural task promises to differentiate model-based from model-free reinforcement learning, while generating neurophysiologically-friendly decision datasets with parametric variation of decision variables. These desirable features have prompted its widespread ado...

Bibliographic Details
Main Authors: Akam, Thomas; Costa, Rui; Dayan, Peter
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4686094/
https://www.ncbi.nlm.nih.gov/pubmed/26657806
http://dx.doi.org/10.1371/journal.pcbi.1004648
author Akam, Thomas
Costa, Rui
Dayan, Peter
collection PubMed
description The recently developed ‘two-step’ behavioural task promises to differentiate model-based from model-free reinforcement learning, while generating neurophysiologically-friendly decision datasets with parametric variation of decision variables. These desirable features have prompted its widespread adoption. Here, we analyse the interactions between a range of different strategies and the structure of transitions and outcomes in order to examine constraints on what can be learned from behavioural performance. The task involves a trade-off between the need for stochasticity, to allow strategies to be discriminated, and a need for determinism, so that it is worth subjects’ investment of effort to exploit the contingencies optimally. We show through simulation that under certain conditions model-free strategies can masquerade as being model-based. We first show that seemingly innocuous modifications to the task structure can induce correlations between action values at the start of the trial and the subsequent trial events in such a way that analysis based on comparing successive trials can lead to erroneous conclusions. We confirm the power of a suggested correction to the analysis that can alleviate this problem. We then consider model-free reinforcement learning strategies that exploit correlations between where rewards are obtained and which actions have high expected value. These generate behaviour that appears model-based under these, and also more sophisticated, analyses. Exploiting the full potential of the two-step task as a tool for behavioural neuroscience requires an understanding of these issues.
format Online
Article
Text
id pubmed-4686094
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4686094 2016-01-07
Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task
Akam, Thomas; Costa, Rui; Dayan, Peter
PLoS Comput Biol
Research Article
Public Library of Science 2015-12-11
/pmc/articles/PMC4686094/
/pubmed/26657806
http://dx.doi.org/10.1371/journal.pcbi.1004648
Text en
© 2015 Akam et al
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task
topic Research Article