Dopamine reward prediction errors reflect hidden state inference across time
Midbrain dopamine neurons signal reward prediction error (RPE), or actual minus expected reward. The temporal difference (TD) learning model has been a cornerstone in understanding how dopamine RPEs could drive associative learning. Classically, TD learning imparts value to features that serially tr...
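The summary defines RPE as actual minus expected reward and mentions TD learning over a belief state. The following is a minimal illustrative sketch of those two ideas, not the authors' model. The discount factor `gamma`, the learning rate `alpha`, the linear value function over belief probabilities, and all function names are assumptions introduced here for illustration.

```python
def td_error(reward, value_now, value_next, gamma=0.98):
    """Reward prediction error: delta = r + gamma * V(next) - V(now)."""
    return reward + gamma * value_next - value_now


def belief_value(weights, belief):
    """Value of a belief state, assumed linear: sum of weight * probability
    over the hidden states."""
    return sum(w * b for w, b in zip(weights, belief))


def belief_td_update(weights, belief_now, belief_next, reward,
                     gamma=0.98, alpha=0.1):
    """One TD(0) step on a belief state.

    Returns the updated weights and the prediction error delta.
    """
    delta = td_error(
        reward,
        belief_value(weights, belief_now),
        belief_value(weights, belief_next),
        gamma,
    )
    # Each weight is credited in proportion to how probable its hidden
    # state was under the current belief.
    new_weights = [w + alpha * delta * b for w, b in zip(weights, belief_now)]
    return new_weights, delta


# Hypothetical example: two hidden states, an uncertain belief, reward received.
weights, delta = belief_td_update(
    weights=[0.0, 0.0],
    belief_now=[0.5, 0.5],
    belief_next=[0.0, 1.0],
    reward=1.0,
)
```

In this sketch an ambiguous stimulus is represented by spreading probability across hidden states, so a single prediction error updates the value of every state the agent might be in.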
Main Authors: | Starkweather, Clara Kwon; Babayan, Benedicte M.; Uchida, Naoshige; Gershman, Samuel J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374025/ https://www.ncbi.nlm.nih.gov/pubmed/28263301 http://dx.doi.org/10.1038/nn.4520 |
_version_ | 1782518826920837120 |
---|---|
author | Starkweather, Clara Kwon; Babayan, Benedicte M.; Uchida, Naoshige; Gershman, Samuel J. |
author_facet | Starkweather, Clara Kwon; Babayan, Benedicte M.; Uchida, Naoshige; Gershman, Samuel J. |
author_sort | Starkweather, Clara Kwon |
collection | PubMed |
description | Midbrain dopamine neurons signal reward prediction error (RPE), or actual minus expected reward. The temporal difference (TD) learning model has been a cornerstone in understanding how dopamine RPEs could drive associative learning. Classically, TD learning imparts value to features that serially track elapsed time relative to observable stimuli. In the real world, however, sensory stimuli provide ambiguous information about the hidden state of the environment, leading to the proposal that TD learning might instead compute a value signal based on an inferred distribution of hidden states (a ‘belief state’). In this work, we asked whether dopaminergic signaling supports a TD learning framework that operates over hidden states. We found that dopamine signaling exhibited a striking difference between two tasks that differed only with respect to whether reward was delivered deterministically. Our results favor an associative learning rule that combines cached values with hidden state inference. |
format | Online Article Text |
id | pubmed-5374025 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
record_format | MEDLINE/PubMed |
spelling | pubmed-53740252017-09-06 Dopamine reward prediction errors reflect hidden state inference across time Starkweather, Clara Kwon Babayan, Benedicte M. Uchida, Naoshige Gershman, Samuel J. Nat Neurosci Article Midbrain dopamine neurons signal reward prediction error (RPE), or actual minus expected reward. The temporal difference (TD) learning model has been a cornerstone in understanding how dopamine RPEs could drive associative learning. Classically, TD learning imparts value to features that serially track elapsed time relative to observable stimuli. In the real world, however, sensory stimuli provide ambiguous information about the hidden state of the environment, leading to the proposal that TD learning might instead compute a value signal based on an inferred distribution of hidden states (a ‘belief state’). In this work, we asked whether dopaminergic signaling supports a TD learning framework that operates over hidden states. We found that dopamine signaling exhibited a striking difference between two tasks that differed only with respect to whether reward was delivered deterministically. Our results favor an associative learning rule that combines cached values with hidden state inference. 2017-03-06 2017-04 /pmc/articles/PMC5374025/ /pubmed/28263301 http://dx.doi.org/10.1038/nn.4520 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use:http://www.nature.com/authors/editorial_policies/license.html#terms |
title | Dopamine reward prediction errors reflect hidden state inference across time |
title_sort | dopamine reward prediction errors reflect hidden state inference across time |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374025/ https://www.ncbi.nlm.nih.gov/pubmed/28263301 http://dx.doi.org/10.1038/nn.4520 |