Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task
The recently developed ‘two-step’ behavioural task promises to differentiate model-based from model-free reinforcement learning, while generating neurophysiologically-friendly decision datasets with parametric variation of decision variables. These desirable features have prompted its widespread ado...
Main authors: | Akam, Thomas; Costa, Rui; Dayan, Peter |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4686094/ https://www.ncbi.nlm.nih.gov/pubmed/26657806 http://dx.doi.org/10.1371/journal.pcbi.1004648 |
_version_ | 1782406406122504192 |
---|---|
author | Akam, Thomas; Costa, Rui; Dayan, Peter |
author_facet | Akam, Thomas; Costa, Rui; Dayan, Peter |
author_sort | Akam, Thomas |
collection | PubMed |
description | The recently developed ‘two-step’ behavioural task promises to differentiate model-based from model-free reinforcement learning, while generating neurophysiologically-friendly decision datasets with parametric variation of decision variables. These desirable features have prompted its widespread adoption. Here, we analyse the interactions between a range of different strategies and the structure of transitions and outcomes in order to examine constraints on what can be learned from behavioural performance. The task involves a trade-off between the need for stochasticity, to allow strategies to be discriminated, and a need for determinism, so that it is worth subjects’ investment of effort to exploit the contingencies optimally. We show through simulation that under certain conditions model-free strategies can masquerade as being model-based. We first show that seemingly innocuous modifications to the task structure can induce correlations between action values at the start of the trial and the subsequent trial events in such a way that analysis based on comparing successive trials can lead to erroneous conclusions. We confirm the power of a suggested correction to the analysis that can alleviate this problem. We then consider model-free reinforcement learning strategies that exploit correlations between where rewards are obtained and which actions have high expected value. These generate behaviour that appears model-based under these, and also more sophisticated, analyses. Exploiting the full potential of the two-step task as a tool for behavioural neuroscience requires an understanding of these issues. |
format | Online Article Text |
id | pubmed-4686094 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4686094 2016-01-07 Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task Akam, Thomas; Costa, Rui; Dayan, Peter PLoS Comput Biol Research Article Public Library of Science 2015-12-11 /pmc/articles/PMC4686094/ /pubmed/26657806 http://dx.doi.org/10.1371/journal.pcbi.1004648 Text en © 2015 Akam et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Akam, Thomas Costa, Rui Dayan, Peter Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task |
title | Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task |
title_full | Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task |
title_fullStr | Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task |
title_full_unstemmed | Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task |
title_short | Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task |
title_sort | simple plans or sophisticated habits? state, transition and learning interactions in the two-step task |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4686094/ https://www.ncbi.nlm.nih.gov/pubmed/26657806 http://dx.doi.org/10.1371/journal.pcbi.1004648 |
work_keys_str_mv | AT akamthomas simpleplansorsophisticatedhabitsstatetransitionandlearninginteractionsinthetwosteptask AT costarui simpleplansorsophisticatedhabitsstatetransitionandlearninginteractionsinthetwosteptask AT dayanpeter simpleplansorsophisticatedhabitsstatetransitionandlearninginteractionsinthetwosteptask |
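
The description above refers to analysis "based on comparing successive trials". As a rough illustration only, and not the authors' code, the Python sketch below simulates a simple model-free learner on a reduced two-step task and tabulates how often it repeats its first-step choice, split by whether the previous trial was rewarded and whether its transition was common or rare. Everything numeric here is an assumption made for the example: a single outcome per second-step state, a common-transition probability of 0.7, reward probabilities of 0.8 and 0.2 that swap every 50 trials, and the learning rate and softmax parameter. The function names `simulate_model_free` and `stay_probabilities` are hypothetical.

```python
import math
import random


def simulate_model_free(n_trials=20000, alpha=0.5, beta=3.0,
                        p_common=0.7, block_len=50, seed=0):
    """Simulate a model-free agent on a reduced two-step task.

    Assumed task (illustrative, not taken from the record): first-step
    action a leads to second-step state a with probability p_common,
    otherwise to the other state; each state pays out with its own
    probability, and the two probabilities swap every block_len trials.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]            # model-free values of the two first-step actions
    reward_prob = [0.8, 0.2]  # assumed reward probability of each second-step state
    trials = []
    for t in range(n_trials):
        if t > 0 and t % block_len == 0:
            reward_prob.reverse()  # block reversal of reward probabilities
        # Softmax choice between the two first-step actions.
        p_choose_0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        choice = 0 if rng.random() < p_choose_0 else 1
        common = rng.random() < p_common
        state = choice if common else 1 - choice
        reward = 1 if rng.random() < reward_prob[state] else 0
        # The chosen action's value is updated directly from the outcome,
        # with no use of the transition structure.
        q[choice] += alpha * (reward - q[choice])
        trials.append((choice, common, reward))
    return trials


def stay_probabilities(trials):
    """Return P(repeat first-step choice) keyed by (reward, common)."""
    counts = {}
    for (choice, common, reward), (next_choice, _, _) in zip(trials, trials[1:]):
        stays, total = counts.get((reward, common), (0, 0))
        counts[(reward, common)] = (stays + int(next_choice == choice), total + 1)
    return {key: stays / total for key, (stays, total) in counts.items()}


if __name__ == "__main__":
    for (reward, common), p in sorted(stay_probabilities(simulate_model_free()).items()):
        label = "common" if common else "rare"
        print(f"reward={reward} transition={label}: P(stay)={p:.3f}")
```

Changing `block_len`, the reward probabilities or the agent parameters gives a quick way to explore how the task structure affects this kind of successive-trial summary, which is the concern the description raises. The sketch does not reproduce any result from the article.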