
Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning

Surprise occurs because of differences between a decision outcome and its predicted outcome (prediction error), regardless of whether the error is positive or negative. It has recently been postulated that surprise affects the reward value of the action outcome; studies have indicated that increasin...


Bibliographic Details
Main authors: Sumiya, Motofumi, Katahira, Kentaro
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506125/
https://www.ncbi.nlm.nih.gov/pubmed/33013288
http://dx.doi.org/10.3389/fnins.2020.00852
_version_ 1783584964450713600
author Sumiya, Motofumi
Katahira, Kentaro
author_facet Sumiya, Motofumi
Katahira, Kentaro
author_sort Sumiya, Motofumi
collection PubMed
description Surprise arises from a difference between a decision outcome and its predicted outcome (prediction error), regardless of whether the error is positive or negative. It has recently been postulated that surprise affects the reward value of the action outcome; studies have indicated that increasing surprise, measured as the absolute value of the prediction error, decreases the value of the outcome. However, how surprise affects the value of the outcome and subsequent decision making is unclear. We suggest that, on the assumption that surprise decreases the outcome value, agents will make more risk-averse choices when an outcome is often surprising. Here, we propose the surprise-sensitive utility model, a reinforcement learning model in which surprise decreases the outcome value, to explain how surprise affects subsequent decision making. To investigate the properties of the proposed model, we compare it in simulations with previous reinforcement learning models on two probabilistic learning tasks. The proposed model, like the previous models, accounts for risk-averse choices, and these choices increase as the surprise-based modulation parameter of outcome value increases. We also performed statistical model selection using two experimental datasets from different tasks. The proposed model fits these datasets better than the other models with the same number of free parameters, indicating that it better captures the trial-by-trial dynamics of choice behavior.
format Online
Article
Text
id pubmed-7506125
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7506125 2020-10-02 Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning Sumiya, Motofumi Katahira, Kentaro Front Neurosci Neuroscience Frontiers Media S.A. 2020-09-08 /pmc/articles/PMC7506125/ /pubmed/33013288 http://dx.doi.org/10.3389/fnins.2020.00852 Text en Copyright © 2020 Sumiya and Katahira.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506125/
https://www.ncbi.nlm.nih.gov/pubmed/33013288
http://dx.doi.org/10.3389/fnins.2020.00852
work_keys_str_mv AT sumiyamotofumi surpriseactsasareducerofoutcomevalueinhumanreinforcementlearning
AT katahirakentaro surpriseactsasareducerofoutcomevalueinhumanreinforcementlearning
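
The abstract describes a reinforcement learning model in which surprise, taken as the absolute value of the prediction error, lowers the value of an outcome. The following is a minimal Python sketch of that idea, not the authors' implementation. The record does not give the model's equations, so the update rule, the parameter names (alpha, d, beta), the softmax choice rule, and the two-option task (a sure 0.5 versus a 50% chance of 1.0) are all assumptions made for illustration.

```python
import math
import random


def surprise_sensitive_update(q, reward, alpha=0.2, d=0.5):
    """One value update in which surprise lowers the effective outcome value.

    Assumed form (not stated in this record):
        delta    = reward - q              prediction error
        surprise = |delta|                 unsigned prediction error
        utility  = reward - d * surprise   outcome value reduced by surprise
        q_new    = q + alpha * (utility - q)
    """
    delta = reward - q
    surprise = abs(delta)
    utility = reward - d * surprise
    return q + alpha * (utility - q)


def softmax_choice_prob(q_a, q_b, beta=5.0):
    """Probability of choosing option A over option B (assumed softmax rule)."""
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def simulate(n_trials=2000, alpha=0.2, d=0.5, beta=5.0, seed=0):
    """Return the fraction of safe choices in a hypothetical two-option task.

    The safe option always pays 0.5; the risky option pays 1.0 or 0.0 with
    equal probability, so both have the same expected reward.
    """
    rng = random.Random(seed)
    q_safe, q_risky = 0.0, 0.0
    safe_choices = 0
    for _ in range(n_trials):
        p_safe = softmax_choice_prob(q_safe, q_risky, beta)
        if rng.random() < p_safe:
            safe_choices += 1
            q_safe = surprise_sensitive_update(q_safe, 0.5, alpha, d)
        else:
            reward = 1.0 if rng.random() < 0.5 else 0.0
            q_risky = surprise_sensitive_update(q_risky, reward, alpha, d)
    return safe_choices / n_trials


if __name__ == "__main__":
    # Compare the share of safe choices without and with surprise modulation.
    for d in (0.0, 0.5):
        print(f"d={d}: fraction of safe choices = {simulate(d=d):.2f}")
```

Under these assumptions, the safe option's value settles at its reward because its surprise vanishes, while the risky option keeps producing surprise and so is valued below its expected reward. This is one way the risk aversion described in the abstract could arise; setting d to 0.0 reduces the update to a standard delta rule.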