
Few-shot learning: temporal scaling in behavioral and dopaminergic learning


Bibliographic Details
Main authors: Burke, Dennis A; Jeong, Huijeong; Wu, Brenda; Lee, Seul Ah; Floeder, Joseph R; Namboodiri, Vijay Mohan K
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081323/
https://www.ncbi.nlm.nih.gov/pubmed/37034619
http://dx.doi.org/10.1101/2023.03.31.535173
Description
Summary: How do we learn associations in the world (e.g., between cues and rewards)? Cue-reward associative learning is controlled in the brain by mesolimbic dopamine (1-4). It is widely believed that dopamine drives such learning by conveying a reward prediction error (RPE) in accordance with temporal difference reinforcement learning (TDRL) algorithms (5). TDRL implementations are “trial-based”: learning progresses sequentially across individual cue-outcome experiences. Accordingly, a foundational assumption, often considered a mere truism, is that the more cue-reward pairings one experiences, the more one learns this association. Here, we disprove this assumption, thereby falsifying a foundational principle of trial-based learning algorithms. Specifically, when one group of head-fixed mice received ten times fewer experiences than another group over the same total time, a single experience in the first group produced as much learning as ten experiences in the second. This quantitative scaling also holds for mesolimbic dopaminergic learning, with the increase in learning rate being so high that the group with fewer experiences exhibits dopaminergic learning in as few as four cue-reward experiences and behavioral learning in nine. An algorithm implementing reward-triggered retrospective learning explains these findings. The temporal scaling and few-shot learning observed here fundamentally change our understanding of the neural algorithms of associative learning.
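
The trial-based assumption described in the summary can be illustrated with a minimal sketch. The code below is not the authors' retrospective learning algorithm, and it is not a full TDRL implementation. It uses a Rescorla-Wagner style update as a stand-in for trial-based learning, and the function names, the learning rate of 0.1, and the reward size of 1.0 are all assumptions chosen for illustration. It shows that under a fixed per-trial learning rate, associative strength depends only on the number of pairings, and it computes the per-trial rate a sparse schedule would need for one pairing to match ten.

```python
def trial_based_value(n_trials, alpha=0.1, reward=1.0):
    """Associative strength after n_trials cue-reward pairings.

    Stand-in for trial-based learning (Rescorla-Wagner style update).
    alpha and reward are illustrative values, not taken from the source.
    """
    value = 0.0
    for _ in range(n_trials):
        rpe = reward - value   # reward prediction error on this pairing
        value += alpha * rpe   # one update per pairing, regardless of spacing
    return value


def equivalent_single_trial_rate(alpha, n_trials=10):
    """Per-trial rate at which ONE pairing matches n_trials pairings at rate alpha.

    Follows from the closed form value_n = reward * (1 - (1 - alpha) ** n).
    """
    return 1.0 - (1.0 - alpha) ** n_trials


if __name__ == "__main__":
    dense = trial_based_value(10, alpha=0.1)    # ten pairings
    sparse = trial_based_value(1, alpha=0.1)    # one pairing, same alpha
    print(f"fixed alpha: ten pairings -> {dense:.3f}, one pairing -> {sparse:.3f}")

    scaled = equivalent_single_trial_rate(0.1, n_trials=10)
    matched = trial_based_value(1, alpha=scaled)
    print(f"scaled alpha = {scaled:.3f}: one pairing -> {matched:.3f}")
```

With a fixed rate, the update never sees the time between pairings, so the sparse schedule learns less per unit time. The result reported in the summary corresponds, in this descriptive sketch only, to the sparse group behaving as if its per-trial rate had been scaled up so that one pairing matches ten.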