Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning
Main Authors: | Sumiya, Motofumi; Katahira, Kentaro |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2020 |
Subjects: | Neuroscience |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506125/ https://www.ncbi.nlm.nih.gov/pubmed/33013288 http://dx.doi.org/10.3389/fnins.2020.00852 |
Field | Value
---|---
_version_ | 1783584964450713600
author | Sumiya, Motofumi; Katahira, Kentaro
author_facet | Sumiya, Motofumi; Katahira, Kentaro
author_sort | Sumiya, Motofumi |
collection | PubMed |
description | Surprise occurs when a decision outcome differs from its predicted outcome (prediction error), regardless of whether the error is positive or negative. It has recently been postulated that surprise affects the reward value of the action outcome; studies have indicated that greater surprise, quantified as the absolute value of the prediction error, decreases the value of the outcome. However, how surprise affects the value of the outcome and subsequent decision making is unclear. We suggest that, if surprise decreases the outcome value, agents will make more risk-averse choices when an outcome is often surprising. Here, we propose the surprise-sensitive utility model, a reinforcement learning model in which surprise decreases the outcome value, to explain how surprise affects subsequent decision making. To investigate the properties of the proposed model, we compare it with previous reinforcement learning models on two probabilistic learning tasks by simulation. The proposed model reproduces risk-averse choices, as the previous models do, and risk aversion increases as the surprise-based modulation parameter of the outcome value increases. We also performed statistical model selection using two experimental datasets from different tasks. The proposed model fits these datasets better than the other models with the same number of free parameters, indicating that it better captures the trial-by-trial dynamics of choice behavior. (An illustrative sketch of such a surprise-discounted value update follows this record.) |
format | Online Article Text |
id | pubmed-7506125 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7506125 2020-10-02 Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning Sumiya, Motofumi; Katahira, Kentaro Front Neurosci Neuroscience Frontiers Media S.A. 2020-09-08 /pmc/articles/PMC7506125/ /pubmed/33013288 http://dx.doi.org/10.3389/fnins.2020.00852 Text en Copyright © 2020 Sumiya and Katahira. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Sumiya, Motofumi Katahira, Kentaro Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning |
title | Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning |
title_full | Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning |
title_fullStr | Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning |
title_full_unstemmed | Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning |
title_short | Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning |
title_sort | surprise acts as a reducer of outcome value in human reinforcement learning |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506125/ https://www.ncbi.nlm.nih.gov/pubmed/33013288 http://dx.doi.org/10.3389/fnins.2020.00852 |
work_keys_str_mv | AT sumiyamotofumi surpriseactsasareducerofoutcomevalueinhumanreinforcementlearning AT katahirakentaro surpriseactsasareducerofoutcomevalueinhumanreinforcementlearning |
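
The record above describes the surprise-sensitive utility model only verbally: surprise, taken as the absolute value of the prediction error, reduces the value of the obtained outcome before it is learned from. The Python sketch below illustrates one plausible reading of that idea on a two-armed bandit with a safe and a risky option. The specific utility form u = r - w * |r - Q(a)|, the parameter names (alpha, beta, w), and the task setup are illustrative assumptions, not the authors' exact formulation.

```python
# Minimal sketch of a "surprise-sensitive utility" Q-learning agent on a
# two-armed bandit (safe vs. risky arm). The functional form below is an
# assumption made for illustration: the outcome value is reduced in
# proportion to the unsigned prediction error (the "surprise") before the
# usual delta-rule update is applied.
import numpy as np

rng = np.random.default_rng(0)


def softmax(q, beta):
    """Choice probabilities from action values, with inverse temperature beta."""
    z = beta * (q - q.max())  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()


def run_agent(n_trials=200, alpha=0.3, beta=3.0, w=0.5):
    # Arm 0 is safe (always pays 0.5); arm 1 is risky (pays 1.0 or 0.0 with p = 0.5).
    q = np.zeros(2)                          # learned action values
    choices = np.zeros(n_trials, dtype=int)
    for t in range(n_trials):
        a = rng.choice(2, p=softmax(q, beta))
        r = 0.5 if a == 0 else float(rng.random() < 0.5)
        surprise = abs(r - q[a])             # unsigned prediction error
        u = r - w * surprise                 # assumed surprise-discounted utility
        q[a] += alpha * (u - q[a])           # delta-rule update on the utility
        choices[t] = a
    return choices


print("safe-arm choice rate:", 1 - run_agent().mean())
```

With w = 0 this reduces to ordinary Q-learning with a softmax choice rule. With w > 0, the risky arm's large unsigned prediction errors shrink its effective outcome value, so the agent drifts toward the safe arm; this is the mechanism the abstract links to increased risk aversion.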