‘Proactive’ use of cue-context congruence for building reinforcement learning’s reward function
BACKGROUND: Reinforcement learning is a fundamental form of learning that may be formalized using the Bellman equation. Accordingly, an agent determines the state value as the sum of the immediate reward and the discounted value of future states. Thus the value of a state is determined by agent-related...
Main Authors: | Zsuga, Judit; Biro, Klara; Tajti, Gabor; Szilasi, Magdolna Emma; Papp, Csaba; Juhasz, Bela; Gesztelyi, Rudolf |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5086043/ https://www.ncbi.nlm.nih.gov/pubmed/27793098 http://dx.doi.org/10.1186/s12868-016-0302-7 |
_version_ | 1782463671356620800 |
---|---|
author | Zsuga, Judit; Biro, Klara; Tajti, Gabor; Szilasi, Magdolna Emma; Papp, Csaba; Juhasz, Bela; Gesztelyi, Rudolf |
author_facet | Zsuga, Judit; Biro, Klara; Tajti, Gabor; Szilasi, Magdolna Emma; Papp, Csaba; Juhasz, Bela; Gesztelyi, Rudolf |
author_sort | Zsuga, Judit |
collection | PubMed |
description | BACKGROUND: Reinforcement learning is a fundamental form of learning that may be formalized using the Bellman equation. Accordingly, an agent determines the state value as the sum of the immediate reward and the discounted value of future states. Thus the value of a state is determined by agent-related attributes (action set, policy, discount factor) and by the agent’s knowledge of the environment, embodied in the reward function and in hidden environmental factors given by the transition probability. The central objective of reinforcement learning is to estimate these two functions, which lie outside the agent’s control, either with or without a model. RESULTS: In the present paper, using the proactive model of reinforcement learning, we offer insight into how the brain creates simplified representations of the environment, and how these representations are organized to support the identification of relevant stimuli and actions. Furthermore, we identify neurobiological correlates of our model by suggesting that the reward and policy functions, attributes of the Bellman equation, are built by the orbitofrontal cortex (OFC) and the anterior cingulate cortex (ACC), respectively. CONCLUSIONS: Based on this, we propose that the OFC assesses cue-context congruence to activate the most relevant context frame. Furthermore, given the bidirectional neuroanatomical link between the OFC and model-free structures, we suggest that model-based input is incorporated into the reward prediction error (RPE) signal, and conversely, that the RPE signal may be used to update the reward-related information of context frames and the policy underlying action selection in the OFC and ACC, respectively. Furthermore, clinical implications for cognitive behavioral interventions are discussed. |
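The abstract's account of the Bellman equation (a state's value is the immediate reward plus the discounted value of future states) and of the RPE teaching signal can be made concrete with a short sketch. This is a minimal illustration, not code from the article; the 3-state chain MDP, its rewards, and the discount factor below are hypothetical choices.

```python
# Illustrative sketch: value iteration on a toy deterministic 3-state chain
# MDP, plus the temporal-difference reward prediction error (RPE).

GAMMA = 0.9  # discount factor (an agent-related attribute)

rewards = [0.0, 0.0, 1.0]   # immediate reward received on entering each state
transitions = {0: 1, 1: 2}  # deterministic next-state map; state 2 is terminal

def value_iteration(sweeps=50):
    """Repeated Bellman backup: V(s) = R(s') + gamma * V(s')."""
    v = [0.0, 0.0, 0.0]
    for _ in range(sweeps):
        for s, s_next in transitions.items():
            v[s] = rewards[s_next] + GAMMA * v[s_next]
    return v

def rpe(v, s, s_next, r):
    """Reward prediction error: delta = r + gamma * V(s') - V(s)."""
    return r + GAMMA * v[s_next] - v[s]
```

Once the values have converged, the RPE for an experienced transition is zero; a nonzero delta is exactly the teaching signal that the abstract attributes to model-free structures and proposes is enriched by model-based OFC input.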
format | Online Article Text |
id | pubmed-5086043 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5086043 2016-10-31 ‘Proactive’ use of cue-context congruence for building reinforcement learning’s reward function Zsuga, Judit; Biro, Klara; Tajti, Gabor; Szilasi, Magdolna Emma; Papp, Csaba; Juhasz, Bela; Gesztelyi, Rudolf BMC Neurosci Research Article BioMed Central 2016-10-28 /pmc/articles/PMC5086043/ /pubmed/27793098 http://dx.doi.org/10.1186/s12868-016-0302-7 Text en © The Author(s) 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Zsuga, Judit; Biro, Klara; Tajti, Gabor; Szilasi, Magdolna Emma; Papp, Csaba; Juhasz, Bela; Gesztelyi, Rudolf ‘Proactive’ use of cue-context congruence for building reinforcement learning’s reward function |
title | ‘Proactive’ use of cue-context congruence for building reinforcement learning’s reward function |
title_full | ‘Proactive’ use of cue-context congruence for building reinforcement learning’s reward function |
title_fullStr | ‘Proactive’ use of cue-context congruence for building reinforcement learning’s reward function |
title_full_unstemmed | ‘Proactive’ use of cue-context congruence for building reinforcement learning’s reward function |
title_short | ‘Proactive’ use of cue-context congruence for building reinforcement learning’s reward function |
title_sort | ‘proactive’ use of cue-context congruence for building reinforcement learning’s reward function |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5086043/ https://www.ncbi.nlm.nih.gov/pubmed/27793098 http://dx.doi.org/10.1186/s12868-016-0302-7 |
work_keys_str_mv | AT zsugajudit proactiveuseofcuecontextcongruenceforbuildingreinforcementlearningsrewardfunction AT biroklara proactiveuseofcuecontextcongruenceforbuildingreinforcementlearningsrewardfunction AT tajtigabor proactiveuseofcuecontextcongruenceforbuildingreinforcementlearningsrewardfunction AT szilasimagdolnaemma proactiveuseofcuecontextcongruenceforbuildingreinforcementlearningsrewardfunction AT pappcsaba proactiveuseofcuecontextcongruenceforbuildingreinforcementlearningsrewardfunction AT juhaszbela proactiveuseofcuecontextcongruenceforbuildingreinforcementlearningsrewardfunction AT gesztelyirudolf proactiveuseofcuecontextcongruenceforbuildingreinforcementlearningsrewardfunction |