What is value—accumulated reward or evidence?
Why are you reading this abstract? In some sense, your answer will cast the exercise as valuable—but what is value? In what follows, we suggest that value is evidence or, more exactly, log Bayesian evidence. This implies that a sufficient explanation for valuable behavior is the accumulation of evidence for internal models of our world. This contrasts with normative models of optimal control and reinforcement learning, which assume the existence of a value function that explains behavior, where (somewhat tautologically) behavior maximizes value. In this paper, we consider an alternative formulation—active inference—that replaces policies in normative models with prior beliefs about the (future) states agents should occupy. This enables optimal behavior to be cast purely in terms of inference: where agents sample their sensorium to maximize the evidence for their generative model of hidden states in the world, and minimize their uncertainty about those states. Crucially, this formulation resolves the tautology inherent in normative models and allows one to consider how prior beliefs are themselves optimized in a hierarchical setting. We illustrate these points by showing that any optimal policy can be specified with prior beliefs in the context of Bayesian inference. We then show how these prior beliefs are themselves prescribed by an imperative to minimize uncertainty. This formulation explains the saccadic eye movements required to read this text and defines the value of the visual sensations you are soliciting.
Main Authors: | Friston, Karl; Adams, Rick; Montague, Read |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2012 |
Subjects: | Neuroscience |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3487150/ https://www.ncbi.nlm.nih.gov/pubmed/23133414 http://dx.doi.org/10.3389/fnbot.2012.00011 |
author | Friston, Karl Adams, Rick Montague, Read |
collection | PubMed |
id | pubmed-3487150 |
institution | National Center for Biotechnology Information |
record_format | MEDLINE/PubMed |
spelling | What is value—accumulated reward or evidence? Friston, Karl; Adams, Rick; Montague, Read. Front Neurorobot (Neuroscience). Frontiers Media S.A., published online 2012-11-02. /pmc/articles/PMC3487150/ /pubmed/23133414 http://dx.doi.org/10.3389/fnbot.2012.00011. Copyright © 2012 Friston, Adams and Montague. http://www.frontiersin.org/licenseagreement This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution, and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics. |
topic | Neuroscience |
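The abstract's central identity—value as log Bayesian evidence—can be sketched in the variational form standard in this literature. The notation below (observations o, hidden states s, model m, approximate posterior q) is a generic assumption, not taken verbatim from the paper:

```latex
% Value of sensory samples o under generative model m is log evidence:
V(o) = \ln p(o \mid m)
% Variational free energy F upper-bounds negative log evidence for any
% approximate posterior q(s) over hidden states s:
F = \mathbb{E}_{q(s)}\!\left[ \ln q(s) - \ln p(o, s \mid m) \right]
  \;\ge\; -\ln p(o \mid m)
% Hence minimizing F (uncertainty) maximizes evidence (value),
% which is the sense in which active inference replaces value functions.
```

Under this reading, an agent that samples its sensorium to minimize F is, by construction, accumulating evidence for its generative model rather than maximizing an externally defined reward.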