Coexistence of Reward and Unsupervised Learning During the Operant Conditioning of Neural Firing Rates

A fundamental goal of neuroscience is to understand how cognitive processes, such as operant conditioning, are performed by the brain. Typical and well studied examples of operant conditioning, in which the firing rates of individual cortical neurons in monkeys are increased using rewards, provide a...

Bibliographic Details
Main Authors: Kerr, Robert R., Grayden, David B., Thomas, Doreen A., Gilson, Matthieu, Burkitt, Anthony N.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3903641/
https://www.ncbi.nlm.nih.gov/pubmed/24475240
http://dx.doi.org/10.1371/journal.pone.0087123
author Kerr, Robert R.
Grayden, David B.
Thomas, Doreen A.
Gilson, Matthieu
Burkitt, Anthony N.
collection PubMed
description A fundamental goal of neuroscience is to understand how cognitive processes, such as operant conditioning, are performed by the brain. Typical and well studied examples of operant conditioning, in which the firing rates of individual cortical neurons in monkeys are increased using rewards, provide an opportunity for insight into this. Studies of reward-modulated spike-timing-dependent plasticity (RSTDP), and of other models such as R-max, have reproduced this learning behavior, but they have assumed that no unsupervised learning is present (i.e., no learning occurs without, or independent of, rewards). We show that these models cannot elicit firing rate reinforcement while exhibiting both reward learning and ongoing, stable unsupervised learning. To fix this issue, we propose a new RSTDP model of synaptic plasticity based upon the observed effects that dopamine has on long-term potentiation and depression (LTP and LTD). We show, both analytically and through simulations, that our new model can exhibit unsupervised learning and lead to firing rate reinforcement. This requires that the strengthening of LTP by the reward signal is greater than the strengthening of LTD and that the reinforced neuron exhibits irregular firing. We show the robustness of our findings to spike-timing correlations, to the synaptic weight dependence that is assumed, and to changes in the mean reward. We also consider our model in the differential reinforcement of two nearby neurons. Our model aligns more strongly with experimental studies than previous models and makes testable predictions for future experiments.
format Online
Article
Text
id pubmed-3903641
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3903641 2014-01-28 Coexistence of Reward and Unsupervised Learning During the Operant Conditioning of Neural Firing Rates Kerr, Robert R. Grayden, David B. Thomas, Doreen A. Gilson, Matthieu Burkitt, Anthony N. PLoS One Research Article
Public Library of Science 2014-01-27 /pmc/articles/PMC3903641/ /pubmed/24475240 http://dx.doi.org/10.1371/journal.pone.0087123 Text en © 2014 Kerr et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Coexistence of Reward and Unsupervised Learning During the Operant Conditioning of Neural Firing Rates
topic Research Article