
Few-shot learning: temporal scaling in behavioral and dopaminergic learning

How do we learn associations in the world (e.g., between cues and rewards)? Cue-reward associative learning is controlled in the brain by mesolimbic dopamine(1-4). It is widely believed that dopamine drives such learning by conveying a reward prediction error (RPE) in accordance with temporal difference reinforcement learning (TDRL) algorithms(5). TDRL implementations are "trial-based": learning progresses sequentially across individual cue-outcome experiences. Accordingly, a foundational assumption, often considered a mere truism, is that the more cue-reward pairings one experiences, the more one learns this association. Here, we disprove this assumption, thereby falsifying a foundational principle of trial-based learning algorithms. Specifically, when a group of head-fixed mice received ten times fewer experiences over the same total time as another, a single experience produced as much learning as ten experiences in the other group. This quantitative scaling also holds for mesolimbic dopaminergic learning, with the increase in learning rate being so high that the group with fewer experiences exhibits dopaminergic learning in as few as four cue-reward experiences and behavioral learning in nine. An algorithm implementing reward-triggered retrospective learning explains these findings. The temporal scaling and few-shot learning observed here fundamentally change our understanding of the neural algorithms of associative learning.
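For context, a minimal sketch (not from the paper) of the "trial-based" assumption the abstract describes: in a standard TDRL- or delta-rule-style model, cue value is updated once per cue-reward pairing via an RPE, so more pairings always mean more learning regardless of how the pairings are spaced in time. The function name, learning rate, and variable names below are illustrative assumptions, not quantities from the study, and this is not the authors' retrospective learning algorithm.

def trial_based_learning(n_pairings: int, alpha: float = 0.1, reward: float = 1.0) -> list[float]:
    """Return the learned cue value after each of n_pairings cue-reward trials."""
    value = 0.0
    history = []
    for _ in range(n_pairings):
        rpe = reward - value      # reward prediction error on this trial
        value += alpha * rpe      # delta-rule update: learning accrues per pairing
        history.append(value)
    return history

if __name__ == "__main__":
    # Under this assumption, more pairings always yield more learning,
    # independent of their spacing in time -- the premise the study
    # reports is violated by both behavioral and dopaminergic data.
    print(trial_based_learning(10)[-1], trial_based_learning(100)[-1])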


Bibliographic Details
Main Authors: Burke, Dennis A, Jeong, Huijeong, Wu, Brenda, Lee, Seul Ah, Floeder, Joseph R, Namboodiri, Vijay Mohan K
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory, 2023-03-31
Subjects: Article
License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081323/
https://www.ncbi.nlm.nih.gov/pubmed/37034619
http://dx.doi.org/10.1101/2023.03.31.535173