
A One-Shot Shift from Explore to Exploit in Monkey Prefrontal Cortex

Much animal learning is slow, with cumulative changes in behavior driven by reward prediction errors. When the abstract structure of a problem is known, however, both animals and formal learning models can rapidly attach new items to their roles within this structure, sometimes in a single trial. Fr...


Bibliographic Details
Main Authors: Achterberg, Jascha; Kadohisa, Mikiko; Watanabe, Kei; Kusunoki, Makoto; Buckley, Mark J.; Duncan, John
Format: Online Article Text
Language: English
Published: Society for Neuroscience 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8802942/
https://www.ncbi.nlm.nih.gov/pubmed/34782437
http://dx.doi.org/10.1523/JNEUROSCI.1338-21.2021
author Achterberg, Jascha
Kadohisa, Mikiko
Watanabe, Kei
Kusunoki, Makoto
Buckley, Mark J.
Duncan, John
collection PubMed
description Much animal learning is slow, with cumulative changes in behavior driven by reward prediction errors. When the abstract structure of a problem is known, however, both animals and formal learning models can rapidly attach new items to their roles within this structure, sometimes in a single trial. Frontal cortex is likely to play a key role in this process. To examine information seeking and use in a known problem structure, we trained monkeys in an explore/exploit task, requiring the animal first to test objects for their association with reward, then, once rewarded objects were found, to reselect them on further trials for further rewards. Many cells in the frontal cortex showed an explore/exploit preference aligned with one-shot learning in the monkeys' behavior: the population switched from an explore state to an exploit state after a single trial of learning but partially maintained the explore state if an error indicated that learning had failed. Binary switch from explore to exploit was not explained by continuous changes linked to expectancy or prediction error. Explore/exploit preferences were independent for two stages of the trial: object selection and receipt of feedback. Within an established task structure, frontal activity may control the separate processes of explore and exploit, switching in one trial between the two.
SIGNIFICANCE STATEMENT Much animal learning is slow, with cumulative changes in behavior driven by reward prediction errors. When the abstract structure of a problem is known, however, both animals and formal learning models can rapidly attach new items to their roles within this structure. To address transitions in neural activity during one-shot learning, we trained monkeys in an explore/exploit task using familiar objects and a highly familiar task structure. When learning was rapid, many frontal neurons showed a binary, one-shot switch between explore and exploit. Within an established task structure, frontal activity may control the separate operations of exploring alternative objects to establish their current role, then exploiting this knowledge for further reward.
format Online
Article
Text
id pubmed-8802942
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Society for Neuroscience
record_format MEDLINE/PubMed
spelling pubmed-8802942 2022-02-02 A One-Shot Shift from Explore to Exploit in Monkey Prefrontal Cortex Achterberg, Jascha; Kadohisa, Mikiko; Watanabe, Kei; Kusunoki, Makoto; Buckley, Mark J.; Duncan, John J Neurosci Research Articles Society for Neuroscience 2022-01-12 /pmc/articles/PMC8802942/ /pubmed/34782437 http://dx.doi.org/10.1523/JNEUROSCI.1338-21.2021 Text en Copyright © 2022 Achterberg et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.
title A One-Shot Shift from Explore to Exploit in Monkey Prefrontal Cortex
topic Research Articles