
Dopamine reward prediction errors reflect hidden state inference across time

Midbrain dopamine neurons signal reward prediction error (RPE), or actual minus expected reward. The temporal difference (TD) learning model has been a cornerstone in understanding how dopamine RPEs could drive associative learning. Classically, TD learning imparts value to features that serially track elapsed time relative to observable stimuli. In the real world, however, sensory stimuli provide ambiguous information about the hidden state of the environment, leading to the proposal that TD learning might instead compute a value signal based on an inferred distribution of hidden states (a ‘belief state’). In this work, we asked whether dopaminergic signaling supports a TD learning framework that operates over hidden states. We found that dopamine signaling exhibited a striking difference between two tasks that differed only with respect to whether reward was delivered deterministically. Our results favor an associative learning rule that combines cached values with hidden state inference.


Bibliographic Details
Main Authors: Starkweather, Clara Kwon, Babayan, Benedicte M., Uchida, Naoshige, Gershman, Samuel J.
Format: Online Article Text
Language: English
Published: 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374025/
https://www.ncbi.nlm.nih.gov/pubmed/28263301
http://dx.doi.org/10.1038/nn.4520
author Starkweather, Clara Kwon
Babayan, Benedicte M.
Uchida, Naoshige
Gershman, Samuel J.
author_facet Starkweather, Clara Kwon
Babayan, Benedicte M.
Uchida, Naoshige
Gershman, Samuel J.
author_sort Starkweather, Clara Kwon
collection PubMed
description Midbrain dopamine neurons signal reward prediction error (RPE), or actual minus expected reward. The temporal difference (TD) learning model has been a cornerstone in understanding how dopamine RPEs could drive associative learning. Classically, TD learning imparts value to features that serially track elapsed time relative to observable stimuli. In the real world, however, sensory stimuli provide ambiguous information about the hidden state of the environment, leading to the proposal that TD learning might instead compute a value signal based on an inferred distribution of hidden states (a ‘belief state’). In this work, we asked whether dopaminergic signaling supports a TD learning framework that operates over hidden states. We found that dopamine signaling exhibited a striking difference between two tasks that differed only with respect to whether reward was delivered deterministically. Our results favor an associative learning rule that combines cached values with hidden state inference.
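The belief-state TD model described in the abstract can be illustrated with a minimal sketch. This is a hypothetical toy setup, not the paper's exact model: the value function is assumed linear in the belief vector, V(b) = w·b, and the TD error δ = r + γV(b′) − V(b) plays the role of the dopamine RPE. The function name `td_update`, the learning rate, and the three-state belief vectors are illustrative assumptions.

```python
import numpy as np

def td_update(w, b, b_next, r, gamma=0.98, alpha=0.1):
    """One TD(0) update using belief-state features.

    w      : weight vector, one entry per hidden state
    b      : current belief (probability distribution over hidden states)
    b_next : belief after the next observation
    r      : reward received on the transition
    """
    v, v_next = w @ b, w @ b_next
    delta = r + gamma * v_next - v   # reward prediction error (RPE)
    w = w + alpha * delta * b        # credit assigned in proportion to belief
    return w, delta

w = np.zeros(3)                      # initial values: no reward expected
b = np.array([1.0, 0.0, 0.0])        # certain we are in hidden state 0
b_next = np.array([0.0, 0.5, 0.5])   # ambiguous observation: states 1 and 2 equally likely
w, delta = td_update(w, b, b_next, r=1.0)
```

With all values initialized to zero, an unexpected reward of 1.0 yields a full-sized RPE (δ = 1.0), and the update spreads credit across hidden states in proportion to the belief, which is the key difference from classical TD over observable stimuli.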
format Online
Article
Text
id pubmed-5374025
institution National Center for Biotechnology Information
language English
publishDate 2017
record_format MEDLINE/PubMed
spelling pubmed-53740252017-09-06 Dopamine reward prediction errors reflect hidden state inference across time Starkweather, Clara Kwon Babayan, Benedicte M. Uchida, Naoshige Gershman, Samuel J. Nat Neurosci Article Midbrain dopamine neurons signal reward prediction error (RPE), or actual minus expected reward. The temporal difference (TD) learning model has been a cornerstone in understanding how dopamine RPEs could drive associative learning. Classically, TD learning imparts value to features that serially track elapsed time relative to observable stimuli. In the real world, however, sensory stimuli provide ambiguous information about the hidden state of the environment, leading to the proposal that TD learning might instead compute a value signal based on an inferred distribution of hidden states (a ‘belief state’). In this work, we asked whether dopaminergic signaling supports a TD learning framework that operates over hidden states. We found that dopamine signaling exhibited a striking difference between two tasks that differed only with respect to whether reward was delivered deterministically. Our results favor an associative learning rule that combines cached values with hidden state inference. 2017-03-06 2017-04 /pmc/articles/PMC5374025/ /pubmed/28263301 http://dx.doi.org/10.1038/nn.4520 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use:http://www.nature.com/authors/editorial_policies/license.html#terms
spellingShingle Article
Starkweather, Clara Kwon
Babayan, Benedicte M.
Uchida, Naoshige
Gershman, Samuel J.
Dopamine reward prediction errors reflect hidden state inference across time
title Dopamine reward prediction errors reflect hidden state inference across time
title_full Dopamine reward prediction errors reflect hidden state inference across time
title_fullStr Dopamine reward prediction errors reflect hidden state inference across time
title_full_unstemmed Dopamine reward prediction errors reflect hidden state inference across time
title_short Dopamine reward prediction errors reflect hidden state inference across time
title_sort dopamine reward prediction errors reflect hidden state inference across time
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374025/
https://www.ncbi.nlm.nih.gov/pubmed/28263301
http://dx.doi.org/10.1038/nn.4520
work_keys_str_mv AT starkweatherclarakwon dopaminerewardpredictionerrorsreflecthiddenstateinferenceacrosstime
AT babayanbenedictem dopaminerewardpredictionerrorsreflecthiddenstateinferenceacrosstime
AT uchidanaoshige dopaminerewardpredictionerrorsreflecthiddenstateinferenceacrosstime
AT gershmansamuelj dopaminerewardpredictionerrorsreflecthiddenstateinferenceacrosstime