
‘Proactive’ use of cue-context congruence for building reinforcement learning’s reward function

BACKGROUND: Reinforcement learning is a fundamental form of learning that may be formalized using the Bellman equation. Accordingly, an agent determines the value of a state as the sum of the immediate reward and the discounted value of future states. The value of a state is thus determined by agent-related attributes (the action set, policy, and discount factor) and by the agent's knowledge of the environment, embodied in the reward function and in hidden environmental factors captured by the transition probability. The central objective of reinforcement learning is to solve for these two functions, which lie outside the agent's control, either with or without a model.

RESULTS: In the present paper, using the proactive model of reinforcement learning, we offer insight into how the brain creates simplified representations of the environment, and how these representations are organized to support the identification of relevant stimuli and actions. Furthermore, we identify neurobiological correlates of our model by suggesting that the reward and policy functions, attributes of the Bellman equation, are built by the orbitofrontal cortex (OFC) and the anterior cingulate cortex (ACC), respectively.

CONCLUSIONS: Based on this, we propose that the OFC assesses cue-context congruence to activate the most relevant context frame. Furthermore, given the bidirectional neuroanatomical link between the OFC and model-free structures, we suggest that model-based input is incorporated into the reward prediction error (RPE) signal and, conversely, that the RPE signal may be used to update the reward-related information of context frames and the policy underlying action selection in the OFC and ACC, respectively. Finally, clinical implications for cognitive behavioral interventions are discussed.
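For reference, the Bellman equation invoked in the BACKGROUND can be written in its standard textbook form (the notation here is ours, not taken from the paper): the value of a state s under policy \pi is

V^\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a) \left[ R(s, a, s') + \gamma V^\pi(s') \right]

where the action set, the policy \pi, and the discount factor \gamma are the agent-related attributes named in the abstract, while the reward function R and the transition probability P embody the agent's knowledge of the environment and its hidden factors.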

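The reward prediction error (RPE) signal discussed in the CONCLUSIONS is, in standard temporal-difference terms, the mismatch between the received reward plus the discounted value of the next state and the current value estimate. Below is a minimal TD(0) sketch in Python illustrating only this textbook mechanism by which an RPE updates state values; it is not the authors' OFC/ACC model, and all names and numbers are illustrative:

    # Minimal TD(0) sketch: a reward prediction error (RPE) drives value updates.
    # Textbook mechanism only -- not the paper's OFC/ACC model; values are toy choices.
    gamma = 0.9   # discount factor (an agent-related attribute, per the abstract)
    alpha = 0.1   # learning rate

    # Toy three-state chain s0 -> s1 -> s2; reward 1.0 on reaching terminal s2.
    V = {"s0": 0.0, "s1": 0.0, "s2": 0.0}

    def td_update(s, r, s_next):
        """Compute the RPE and nudge V[s] toward the bootstrapped target."""
        rpe = r + gamma * V[s_next] - V[s]   # reward prediction error (delta)
        V[s] += alpha * rpe
        return rpe

    for _ in range(100):              # repeated episodes through the chain
        td_update("s0", 0.0, "s1")
        td_update("s1", 1.0, "s2")

    print(V)   # V["s1"] approaches 1.0; V["s0"] approaches gamma * V["s1"] = 0.9

In the paper's proposal, such an RPE would not only train model-free values but would also carry model-based input and feed back to update reward-related information in OFC context frames and the policy represented in the ACC.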

Bibliographic Details
Main Authors: Zsuga, Judit; Biro, Klara; Tajti, Gabor; Szilasi, Magdolna Emma; Papp, Csaba; Juhasz, Bela; Gesztelyi, Rudolf
Format: Online Article Text
Language: English
Published: BMC Neuroscience, BioMed Central, 28 October 2016
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5086043/
https://www.ncbi.nlm.nih.gov/pubmed/27793098
http://dx.doi.org/10.1186/s12868-016-0302-7
Record: pubmed-5086043 (PubMed collection, National Center for Biotechnology Information); MEDLINE/PubMed record format.
License: © The Author(s) 2016. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided appropriate credit is given to the original author(s) and the source, a link to the Creative Commons license is provided, and any changes are indicated. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.