Active Inference, epistemic value, and vicarious trial and error

Balancing habitual and deliberate forms of choice entails a comparison of their respective merits—the former being faster but inflexible, and the latter slower but more versatile. Here, we show that arbitration between these two forms of control can be derived from first principles within an Active...

Bibliographic Details
Main Authors: Pezzulo, Giovanni; Cartoni, Emilio; Rigoli, Francesco; Pio-Lopez, Léo; Friston, Karl
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4918783/
https://www.ncbi.nlm.nih.gov/pubmed/27317193
http://dx.doi.org/10.1101/lm.041780.116
author Pezzulo, Giovanni
Cartoni, Emilio
Rigoli, Francesco
Pio-Lopez, Léo
Friston, Karl
collection PubMed
description Balancing habitual and deliberate forms of choice entails a comparison of their respective merits: the former is faster but inflexible, and the latter slower but more versatile. Here, we show that arbitration between these two forms of control can be derived from first principles within an Active Inference scheme. We illustrate our arguments with simulations that reproduce rodent spatial decisions in T-mazes. In this context, deliberation has been associated with vicarious trial and error (VTE) behavior (i.e., the fact that rodents sometimes stop at decision points as if deliberating between choice alternatives), whose neurophysiological correlates are “forward sweeps” of hippocampal place cells in the arms of the maze under consideration. Crucially, forward sweeps arise early in learning and disappear shortly after, marking a transition from deliberative to habitual choice. Our simulations show that this transition emerges as the optimal solution to the trade-off between policies that maximize reward or extrinsic value (habitual policies) and those that also consider the epistemic value of exploratory behavior (deliberative or epistemic policies), with the latter requiring VTE and the retrieval of episodic information via forward sweeps. We thus offer a novel perspective on the optimality principles that engender forward sweeps and VTE, and on their role in deliberate choice.
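
The trade-off described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' simulation: it assumes a two-arm maze with a single binary belief about where the reward is, a cue of fixed reliability, and a hypothetical fixed cost of pausing to deliberate. All function names and parameters (`p_left`, `cue_accuracy`, `deliberation_cost`) are illustrative assumptions. Epistemic value is modeled here as the expected information gain (mutual information) between reward location and cue, which shrinks as the belief sharpens with learning.

```python
import math


def bernoulli_entropy(p):
    """Shannon entropy, in nats, of a binary variable with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def extrinsic_value(p_left, reward=1.0):
    """Expected reward of going straight to the arm believed more likely."""
    return max(p_left, 1.0 - p_left) * reward


def epistemic_value(p_left, cue_accuracy):
    """Expected information gain about the reward arm from sampling the cue.

    Computed as the mutual information I(arm; cue) = H(cue) - H(cue | arm).
    """
    # Marginal probability that the cue points to the left arm.
    p_cue_left = p_left * cue_accuracy + (1.0 - p_left) * (1.0 - cue_accuracy)
    return bernoulli_entropy(p_cue_left) - bernoulli_entropy(cue_accuracy)


def prefers_deliberation(p_left, cue_accuracy, deliberation_cost):
    """Hypothetical arbitration rule: deliberate only if the information
    gained outweighs an assumed fixed cost of stopping at the choice point."""
    return epistemic_value(p_left, cue_accuracy) > deliberation_cost


if __name__ == "__main__":
    # Beliefs sharpen from uncertain (0.5) to confident (0.99) over learning.
    for belief in (0.5, 0.7, 0.9, 0.99):
        print(
            belief,
            round(extrinsic_value(belief), 3),
            round(epistemic_value(belief, 0.9), 3),
            prefers_deliberation(belief, 0.9, 0.1),
        )
```

Under these assumptions, an uncertain agent has more to gain from sampling the cue than a confident one, so the rule switches from deliberative to habitual choice as the belief sharpens. The threshold rule and the cost value are design choices of this sketch, not quantities taken from the paper.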
format Online
Article
Text
id pubmed-4918783
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-4918783 2016-07-07 Active Inference, epistemic value, and vicarious trial and error Pezzulo, Giovanni Cartoni, Emilio Rigoli, Francesco Pio-Lopez, Léo Friston, Karl Learn Mem Research
Cold Spring Harbor Laboratory Press 2016-07 /pmc/articles/PMC4918783/ /pubmed/27317193 http://dx.doi.org/10.1101/lm.041780.116 Text en © 2016 Pezzulo et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by/4.0/ This article, published in Learning & Memory, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title Active Inference, epistemic value, and vicarious trial and error
topic Research