
Uncertainty–guided learning with scaled prediction errors in the basal ganglia

To accurately predict rewards associated with states or actions, the variability of observations has to be taken into account. In particular, when the observations are noisy, the individual rewards should have less influence on tracking of average reward, and the estimate of the mean reward should b...


Bibliographic Details
Main Authors:	Möller, Moritz; Manohar, Sanjay; Bogacz, Rafal
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2022
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9182698/ https://www.ncbi.nlm.nih.gov/pubmed/35622863 http://dx.doi.org/10.1371/journal.pcbi.1009816

_version_	1784724100069982208
author	Möller, Moritz; Manohar, Sanjay; Bogacz, Rafal
collection	PubMed
description	To accurately predict rewards associated with states or actions, the variability of observations has to be taken into account. In particular, when the observations are noisy, the individual rewards should have less influence on tracking of average reward, and the estimate of the mean reward should be updated to a smaller extent after each observation. However, it is not known how the magnitude of the observation noise might be tracked and used to control prediction updates in the brain reward system. Here, we introduce a new model that uses simple, tractable learning rules that track the mean and standard deviation of reward, and leverages prediction errors scaled by uncertainty as the central feedback signal. We show that the new model has an advantage over conventional reinforcement learning models in a value tracking task, and approaches a theoretic limit of performance provided by the Kalman filter. Further, we propose a possible biological implementation of the model in the basal ganglia circuit. In the proposed network, dopaminergic neurons encode reward prediction errors scaled by standard deviation of rewards. We show that such scaling may arise if the striatal neurons learn the standard deviation of rewards and modulate the activity of dopaminergic neurons. The model is consistent with experimental findings concerning dopamine prediction error scaling relative to reward magnitude, and with many features of striatal plasticity. Our results span across the levels of implementation, algorithm, and computation, and might have important implications for understanding the dopaminergic prediction error signal and its relation to adaptive and effective learning.
format	Online Article Text
id	pubmed-9182698
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-9182698 2022-06-10 PLoS Comput Biol Research Article
Public Library of Science 2022-05-27 /pmc/articles/PMC9182698/ /pubmed/35622863 http://dx.doi.org/10.1371/journal.pcbi.1009816 Text en © 2022 Möller et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Uncertainty–guided learning with scaled prediction errors in the basal ganglia
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9182698/ https://www.ncbi.nlm.nih.gov/pubmed/35622863 http://dx.doi.org/10.1371/journal.pcbi.1009816
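
The abstract in this record describes learning rules that track the mean and standard deviation of reward and use a prediction error scaled by uncertainty as the feedback signal. The Python sketch below illustrates that general idea only. The specific update forms, the function names `track_reward` and `simulate`, the learning rates `alpha_m` and `alpha_s`, and the floor `s_min` are assumptions made for illustration; they are not taken from the article and may differ from the authors' rules.

```python
import random


def track_reward(rewards, alpha_m=0.02, alpha_s=0.01, m0=0.0, s0=1.0, s_min=0.1):
    """Track a running mean m and spread s of a reward stream.

    Assumed (illustrative) rules, not the published ones:
      delta = (r - m) / s            scaled prediction error
      m    += alpha_m * delta        noisier rewards (large s) move m less
      s    += alpha_s * (delta^2 - 1)  s settles where delta has unit variance
    """
    m, s = m0, s0
    for r in rewards:
        delta = (r - m) / s                    # prediction error scaled by spread
        m += alpha_m * delta                   # effective learning rate is alpha_m / s
        s += alpha_s * (delta * delta - 1.0)   # grow s if errors are larger than expected
        s = max(s, s_min)                      # keep s positive (assumed safeguard)
    return m, s


def simulate(mean=10.0, sd=2.0, n=20000, seed=0):
    """Feed n Gaussian rewards with the given mean and sd to the tracker."""
    rng = random.Random(seed)
    return track_reward(rng.gauss(mean, sd) for _ in range(n))


if __name__ == "__main__":
    m, s = simulate()
    print(f"estimated mean = {m:.2f}, estimated spread = {s:.2f}")
```

With stationary Gaussian rewards, the two estimates should settle close to the true mean and standard deviation. Because the mean update divides the raw error by `s`, a noisier reward source automatically gets a smaller effective learning rate, which is the behaviour the abstract motivates.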
