
Meta-reinforcement learning via orbitofrontal cortex

The meta-reinforcement learning (meta-RL) framework, which involves RL over multiple timescales, has been successful in training deep RL models that generalize to new environments. It has been hypothesized that the prefrontal cortex may mediate meta-RL in the brain, but the evidence is scarce. Here...


Bibliographic Details
Main authors: Hattori, Ryoma; Hedrick, Nathan G.; Jain, Anant; Chen, Shuqi; You, Hanjia; Hattori, Mariko; Choi, Jun-Hyeok; Lim, Byung Kook; Yasuda, Ryohei; Komiyama, Takaki
Format: Online Article Text
Language: English
Published: Nature Publishing Group US 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10689244/
https://www.ncbi.nlm.nih.gov/pubmed/37957318
http://dx.doi.org/10.1038/s41593-023-01485-3
author Hattori, Ryoma
Hedrick, Nathan G.
Jain, Anant
Chen, Shuqi
You, Hanjia
Hattori, Mariko
Choi, Jun-Hyeok
Lim, Byung Kook
Yasuda, Ryohei
Komiyama, Takaki
collection PubMed
description The meta-reinforcement learning (meta-RL) framework, which involves RL over multiple timescales, has been successful in training deep RL models that generalize to new environments. It has been hypothesized that the prefrontal cortex may mediate meta-RL in the brain, but the evidence is scarce. Here we show that the orbitofrontal cortex (OFC) mediates meta-RL. We trained mice and deep RL models on a probabilistic reversal learning task across sessions during which they improved their trial-by-trial RL policy through meta-learning. Ca(2+)/calmodulin-dependent protein kinase II-dependent synaptic plasticity in OFC was necessary for this meta-learning but not for the within-session trial-by-trial RL in experts. After meta-learning, OFC activity robustly encoded value signals, and OFC inactivation impaired the RL behaviors. Longitudinal tracking of OFC activity revealed that meta-learning gradually shapes population value coding to guide the ongoing behavioral policy. Our results indicate that two distinct RL algorithms with distinct neural mechanisms and timescales coexist in OFC to support adaptive decision-making.
format Online
Article
Text
id pubmed-10689244
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group US
record_format MEDLINE/PubMed
spelling pubmed-10689244 2023-12-02 Nat Neurosci Article Nature Publishing Group US 2023-11-13 2023 /pmc/articles/PMC10689244/ /pubmed/37957318 http://dx.doi.org/10.1038/s41593-023-01485-3 Text en
© The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Meta-reinforcement learning via orbitofrontal cortex
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10689244/
https://www.ncbi.nlm.nih.gov/pubmed/37957318
http://dx.doi.org/10.1038/s41593-023-01485-3