
Spatio-Temporal Credit Assignment in Neuronal Population Learning

In learning from trial and error, animals need to relate behavioral decisions to environmental reinforcement even though it may be difficult to assign credit to a particular decision when outcomes are uncertain or subject to delays. When considering the biophysical basis of learning, the credit-assi...

Bibliographic Details
Main Authors: Friedrich, Johannes; Urbanczik, Robert; Senn, Walter
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3127803/
https://www.ncbi.nlm.nih.gov/pubmed/21738460
http://dx.doi.org/10.1371/journal.pcbi.1002092
author Friedrich, Johannes
Urbanczik, Robert
Senn, Walter
collection PubMed
description In learning from trial and error, animals need to relate behavioral decisions to environmental reinforcement even though it may be difficult to assign credit to a particular decision when outcomes are uncertain or subject to delays. When considering the biophysical basis of learning, the credit-assignment problem is compounded because the behavioral decisions themselves result from the spatio-temporal aggregation of many synaptic releases. We present a model of plasticity induction for reinforcement learning in a population of leaky integrate and fire neurons which is based on a cascade of synaptic memory traces. Each synaptic cascade correlates presynaptic input first with postsynaptic events, next with the behavioral decisions and finally with external reinforcement. For operant conditioning, learning succeeds even when reinforcement is delivered with a delay so large that temporal contiguity between decision and pertinent reward is lost due to intervening decisions which are themselves subject to delayed reinforcement. This shows that the model provides a viable mechanism for temporal credit assignment. Further, learning speeds up with increasing population size, so the plasticity cascade simultaneously addresses the spatial problem of assigning credit to synapses in different population neurons. Simulations on other tasks, such as sequential decision making, serve to contrast the performance of the proposed scheme to that of temporal difference-based learning. We argue that, due to their comparative robustness, synaptic plasticity cascades are attractive basic models of reinforcement learning in the brain.
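The three correlation stages named in the description (presynaptic input with postsynaptic events, then with the behavioral decision, then with external reinforcement) can be illustrated with a toy trace cascade for a single synapse. This is only a minimal sketch under stated assumptions, not the authors' model: the class name `SynapticCascade`, the helper `run_trial`, the time constants, the learning rate, and the discrete-time update are all hypothetical choices made for illustration, and no spiking neuron dynamics are simulated.

```python
from dataclasses import dataclass


@dataclass
class SynapticCascade:
    """Toy three-stage memory-trace cascade for one synapse (illustrative only)."""

    tau_post: float = 20.0       # assumed decay time of the pre/post trace
    tau_decision: float = 500.0  # assumed decay time of the decision-gated trace
    lr: float = 0.1              # assumed learning rate
    pre_post: float = 0.0        # stage 1 trace
    tagged: float = 0.0          # stage 2 trace
    weight: float = 0.0          # synaptic weight changed in stage 3

    def step(self, dt, pre, post, decision=0.0, reward=0.0):
        # Stage 1: correlate presynaptic input with postsynaptic events.
        self.pre_post += -dt * self.pre_post / self.tau_post + pre * post
        # Stage 2: a decision signal gates the stage-1 trace into a slower trace.
        self.tagged += -dt * self.tagged / self.tau_decision + decision * self.pre_post
        # Stage 3: reinforcement converts the slow trace into a weight change.
        self.weight += self.lr * reward * self.tagged
        return self.weight


def run_trial(coincident, rewarded, decision_at=5, reward_at=200, steps=250):
    """Run one trial with a decision at `decision_at` and delayed reward at `reward_at`."""
    synapse = SynapticCascade()
    for t in range(steps):
        pre = 1.0 if t == 0 else 0.0
        post = 1.0 if (t == 0 and coincident) else 0.0
        decision = 1.0 if t == decision_at else 0.0
        reward = 1.0 if (t == reward_at and rewarded) else 0.0
        synapse.step(1.0, pre, post, decision, reward)
    return synapse.weight


if __name__ == "__main__":
    print("pre/post coincidence, delayed reward:", run_trial(True, True))
    print("no coincidence, delayed reward:", run_trial(False, True))
    print("pre/post coincidence, no reward:", run_trial(True, False))
```

In this sketch only the synapse whose input coincided with a postsynaptic event, and whose slow trace is still nonzero when the delayed reward arrives, changes its weight; the slower second-stage trace is what bridges the delay between decision and reward.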
format Online
Article
Text
id pubmed-3127803
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3127803 2011-07-07 Spatio-Temporal Credit Assignment in Neuronal Population Learning Friedrich, Johannes; Urbanczik, Robert; Senn, Walter PLoS Comput Biol Research Article Public Library of Science 2011-06-30 /pmc/articles/PMC3127803/ /pubmed/21738460 http://dx.doi.org/10.1371/journal.pcbi.1002092 Text en Friedrich et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Spatio-Temporal Credit Assignment in Neuronal Population Learning
topic Research Article