A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback

Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a learning rule that could explain how behaviorally relevant adaptive changes in complex networks of spiking neurons could be achieved in a self-organizing manner through local synaptic plasticity. Howe...

Bibliographic Details
Main Authors:	Legenstein, Robert; Pecevski, Dejan; Maass, Wolfgang
Format:	Text
Language:	English
Published:	Public Library of Science 2008
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2543108/ https://www.ncbi.nlm.nih.gov/pubmed/18846203 http://dx.doi.org/10.1371/journal.pcbi.1000180

_version_	1782159175493615616
author	Legenstein, Robert Pecevski, Dejan Maass, Wolfgang
author_facet	Legenstein, Robert Pecevski, Dejan Maass, Wolfgang
author_sort	Legenstein, Robert
collection	PubMed
description	Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a learning rule that could explain how behaviorally relevant adaptive changes in complex networks of spiking neurons could be achieved in a self-organizing manner through local synaptic plasticity. However, the capabilities and limitations of this learning rule could so far only be tested through computer simulations. This article provides tools for an analytic treatment of reward-modulated STDP, which allows us to predict under which conditions reward-modulated STDP will achieve a desired learning effect. These analytical results imply that neurons can learn through reward-modulated STDP to classify not only spatial but also temporal firing patterns of presynaptic neurons. They also can learn to respond to specific presynaptic firing patterns with particular spike patterns. Finally, the resulting learning theory predicts that even difficult credit-assignment problems, where it is very hard to tell which synaptic weights should be modified in order to increase the global reward for the system, can be solved in a self-organizing manner through reward-modulated STDP. This yields an explanation for a fundamental experimental result on biofeedback in monkeys by Fetz and Baker. In this experiment monkeys were rewarded for increasing the firing rate of a particular neuron in the cortex and were able to solve this extremely difficult credit assignment problem. Our model for this experiment relies on a combination of reward-modulated STDP with variable spontaneous firing activity. Hence it also provides a possible functional explanation for trial-to-trial variability, which is characteristic for cortical networks of neurons but has no analogue in currently existing artificial computing systems. In addition our model demonstrates that reward-modulated STDP can be applied to all synapses in a large recurrent neural network without endangering the stability of the network dynamics.
format	Text
id	pubmed-2543108
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-25431082008-10-10 A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback Legenstein, Robert Pecevski, Dejan Maass, Wolfgang PLoS Comput Biol Research Article Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a learning rule that could explain how behaviorally relevant adaptive changes in complex networks of spiking neurons could be achieved in a self-organizing manner through local synaptic plasticity. However, the capabilities and limitations of this learning rule could so far only be tested through computer simulations. This article provides tools for an analytic treatment of reward-modulated STDP, which allows us to predict under which conditions reward-modulated STDP will achieve a desired learning effect. These analytical results imply that neurons can learn through reward-modulated STDP to classify not only spatial but also temporal firing patterns of presynaptic neurons. They also can learn to respond to specific presynaptic firing patterns with particular spike patterns. Finally, the resulting learning theory predicts that even difficult credit-assignment problems, where it is very hard to tell which synaptic weights should be modified in order to increase the global reward for the system, can be solved in a self-organizing manner through reward-modulated STDP. This yields an explanation for a fundamental experimental result on biofeedback in monkeys by Fetz and Baker. In this experiment monkeys were rewarded for increasing the firing rate of a particular neuron in the cortex and were able to solve this extremely difficult credit assignment problem. Our model for this experiment relies on a combination of reward-modulated STDP with variable spontaneous firing activity. Hence it also provides a possible functional explanation for trial-to-trial variability, which is characteristic for cortical networks of neurons but has no analogue in currently existing artificial computing systems. In addition our model demonstrates that reward-modulated STDP can be applied to all synapses in a large recurrent neural network without endangering the stability of the network dynamics. Public Library of Science 2008-10-10 /pmc/articles/PMC2543108/ /pubmed/18846203 http://dx.doi.org/10.1371/journal.pcbi.1000180 Text en Legenstein et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle	Research Article Legenstein, Robert Pecevski, Dejan Maass, Wolfgang A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback
title	A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback
title_full	A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback
title_fullStr	A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback
title_full_unstemmed	A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback
title_short	A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback
title_sort	learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2543108/ https://www.ncbi.nlm.nih.gov/pubmed/18846203 http://dx.doi.org/10.1371/journal.pcbi.1000180
work_keys_str_mv	AT legensteinrobert alearningtheoryforrewardmodulatedspiketimingdependentplasticitywithapplicationtobiofeedback AT pecevskidejan alearningtheoryforrewardmodulatedspiketimingdependentplasticitywithapplicationtobiofeedback AT maasswolfgang alearningtheoryforrewardmodulatedspiketimingdependentplasticitywithapplicationtobiofeedback AT legensteinrobert learningtheoryforrewardmodulatedspiketimingdependentplasticitywithapplicationtobiofeedback AT pecevskidejan learningtheoryforrewardmodulatedspiketimingdependentplasticitywithapplicationtobiofeedback AT maasswolfgang learningtheoryforrewardmodulatedspiketimingdependentplasticitywithapplicationtobiofeedback
