Few-shot learning: temporal scaling in behavioral and dopaminergic learning
How do we learn associations in the world (e.g., between cues and rewards)? Cue-reward associative learning is controlled in the brain by mesolimbic dopamine (1-4). It is widely believed that dopamine drives such learning by conveying a reward prediction error (RPE) in accordance with temporal difference reinforcement learning (TDRL) algorithms (5). TDRL implementations are "trial-based": learning progresses sequentially across individual cue-outcome experiences. Accordingly, a foundational assumption—often considered a mere truism—is that the more cue-reward pairings one experiences, the more one learns this association. Here, we disprove this assumption, thereby falsifying a foundational principle of trial-based learning algorithms. Specifically, when a group of head-fixed mice received ten times fewer experiences over the same total time as another, a single experience produced as much learning as ten experiences in the other group. This quantitative scaling also holds for mesolimbic dopaminergic learning, with the increase in learning rate being so high that the group with fewer experiences exhibits dopaminergic learning in as few as four cue-reward experiences and behavioral learning in nine. An algorithm implementing reward-triggered retrospective learning explains these findings. The temporal scaling and few-shot learning observed here fundamentally change our understanding of the neural algorithms of associative learning.
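The abstract contrasts trial-based TDRL, in which each cue-reward pairing contributes a fixed increment of learning, with the observed temporal scaling, in which sparser pairings each produce proportionally more learning. The Python sketch below illustrates that contrast under stated assumptions: `trial_based_learning` is the textbook delta-rule/RPE update, while `temporally_scaled_learning` simply lets the effective learning rate grow with the inter-reward interval (`alpha_eff = k * iri`). The linear scaling rule, the constants `alpha` and `k`, and the session parameters are illustrative assumptions; this is not the authors' reward-triggered retrospective model, only a minimal way to see how learning could track elapsed time rather than pairing count.

```python
def trial_based_learning(n_pairings, alpha=0.02, reward=1.0):
    """Textbook trial-based (delta-rule) update: the cue value V moves a
    fixed step toward the reward on every pairing, so total learning
    depends only on the NUMBER of cue-reward experiences."""
    V = 0.0
    for _ in range(n_pairings):
        rpe = reward - V      # reward prediction error (RPE)
        V += alpha * rpe      # same-size step for every experience
    return V


def temporally_scaled_learning(n_pairings, session_time, k=0.002, reward=1.0):
    """Hedged sketch of temporal scaling: the effective per-experience
    learning rate is ASSUMED to grow linearly with the inter-reward
    interval, so ten times fewer pairings over the same total time get
    roughly ten times more learning per pairing. Illustrative only; not
    the authors' retrospective algorithm."""
    iri = session_time / n_pairings     # mean inter-reward interval
    alpha_eff = min(1.0, k * iri)       # learning rate scales with sparsity
    V = 0.0
    for _ in range(n_pairings):
        V += alpha_eff * (reward - V)
    return V


if __name__ == "__main__":
    T = 3600.0  # one session of fixed duration, in seconds (hypothetical)
    for n in (100, 10):  # dense schedule vs. a 10x sparser one, same total time
        print(f"{n:3d} pairings | trial-based V = {trial_based_learning(n):.3f}"
              f" | temporally scaled V = {temporally_scaled_learning(n, T):.3f}")
```

Because `alpha_eff` is proportional to the inter-reward interval, the product `n_pairings * alpha_eff ≈ k * session_time` is the same for both schedules, so asymptotic learning in this sketch depends on elapsed time rather than on pairing count, which qualitatively matches the reported result that one sparse experience taught as much as ten dense ones.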
Main Authors: | Burke, Dennis A; Jeong, Huijeong; Wu, Brenda; Lee, Seul Ah; Floeder, Joseph R; Namboodiri, Vijay Mohan K |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory, 2023-03-31 |
Subjects: | Article |
Collection: | PubMed (record pubmed-10081323), National Center for Biotechnology Information; record format MEDLINE/PubMed |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081323/ https://www.ncbi.nlm.nih.gov/pubmed/37034619 http://dx.doi.org/10.1101/2023.03.31.535173 |
License: | Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (https://creativecommons.org/licenses/by-nc-nd/4.0/): reusers may copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |