Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T
The model‐free algorithms of “reinforcement learning” (RL) have gained clout across disciplines, but so too have model‐based alternatives. The present study emphasizes other dimensions of this model space in consideration of associative or discriminative generalization across states and actions. Thi...
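Main authors: | Colas, Jaron T.; Dundon, Neil M.; Gerraty, Raphael T.; Saragosa‐Harris, Natalie M.; Szymula, Karol P.; Tanwisuth, Koranis; Tyszka, J. Michael; van Geen, Camilla; Ju, Harang; Toga, Arthur W.; Gold, Joshua I.; Bassett, Dani S.; Hartley, Catherine A.; Shohamy, Daphna; Grafton, Scott T.; O'Doherty, John P. |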
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley & Sons, Inc. 2022 |
Subjects: | Research Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9491297/ https://www.ncbi.nlm.nih.gov/pubmed/35860954 http://dx.doi.org/10.1002/hbm.25988 |
_version_ | 1784793254188810240 |
---|---|
author | Colas, Jaron T.; Dundon, Neil M.; Gerraty, Raphael T.; Saragosa‐Harris, Natalie M.; Szymula, Karol P.; Tanwisuth, Koranis; Tyszka, J. Michael; van Geen, Camilla; Ju, Harang; Toga, Arthur W.; Gold, Joshua I.; Bassett, Dani S.; Hartley, Catherine A.; Shohamy, Daphna; Grafton, Scott T.; O'Doherty, John P. |
author_sort | Colas, Jaron T. |
collection | PubMed |
description | The model‐free algorithms of “reinforcement learning” (RL) have gained clout across disciplines, but so too have model‐based alternatives. The present study emphasizes other dimensions of this model space in consideration of associative or discriminative generalization across states and actions. This “generalized reinforcement learning” (GRL) model, a frugal extension of RL, parsimoniously retains the single reward‐prediction error (RPE), but the scope of learning goes beyond the experienced state and action. Instead, the generalized RPE is efficiently relayed for bidirectional counterfactual updating of value estimates for other representations. Aided by structural information but as an implicit rather than explicit cognitive map, GRL provided the most precise account of human behavior and individual differences in a reversal‐learning task with hierarchical structure that encouraged inverse generalization across both states and actions. Reflecting inference that could be true, false (i.e., overgeneralization), or absent (i.e., undergeneralization), state generalization distinguished those who learned well more so than action generalization. With high‐resolution high‐field fMRI targeting the dopaminergic midbrain, the GRL model's RPE signals (alongside value and decision signals) were localized within not only the striatum but also the substantia nigra and the ventral tegmental area, including specific effects of generalization that also extend to the hippocampus. Factoring in generalization as a multidimensional process in value‐based learning, these findings shed light on complexities that, while challenging classic RL, can still be resolved within the bounds of its core computations. |
format | Online Article Text |
id | pubmed-9491297 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | John Wiley & Sons, Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9491297 2022-09-30 Hum Brain Mapp Research Articles John Wiley & Sons, Inc. 2022-07-21 /pmc/articles/PMC9491297/ /pubmed/35860954 http://dx.doi.org/10.1002/hbm.25988 Text en © 2022 The Authors. Human Brain Mapping published by Wiley Periodicals LLC. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
title | Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T |
topic | Research Articles |
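
Illustrative note on the description field: the abstract says that the GRL model keeps a single reward‐prediction error (RPE) but relays it to value estimates beyond the experienced state and action. The following is a minimal sketch of that idea only, not the authors' implementation. The function name `grl_update`, the 2 × 2 state–action layout, and the parameters `alpha`, `g_state`, and `g_action` are assumptions made for illustration; here, negative generalization weights stand in for discriminative (inverse) generalization and positive weights for associative generalization.

```python
import numpy as np


def grl_update(Q, state, action, reward, alpha=0.3, g_state=-0.5, g_action=-0.5):
    """One generalized update in a hypothetical 2-state, 2-action task.

    Q        : 2x2 array of value estimates, indexed [state, action]
    alpha    : learning rate (assumed parameter)
    g_state  : weight for relaying the RPE to the other state (assumed)
    g_action : weight for relaying the RPE to the other action (assumed)
    """
    Q = Q.copy()  # leave the caller's table untouched
    other_state = 1 - state
    other_action = 1 - action

    # A single reward-prediction error, computed from the experienced pair only.
    rpe = reward - Q[state, action]

    # Standard model-free update for the experienced state-action pair.
    Q[state, action] += alpha * rpe

    # Counterfactual updates: the same RPE is relayed, scaled by a
    # generalization weight, to the unchosen action and the unvisited state.
    Q[state, other_action] += alpha * g_action * rpe
    Q[other_state, action] += alpha * g_state * rpe

    # The pair that differs in both state and action is left unchanged
    # in this sketch; how it should be handled is not stated in the abstract.
    return Q, rpe


Q0 = np.zeros((2, 2))
Q1, rpe = grl_update(Q0, state=0, action=0, reward=1.0)
print(Q1)
```

With both weights set to zero the sketch reduces to an ordinary model‐free update, which is one way to read the abstract's description of GRL as a frugal extension of RL.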