
Toward robust and scalable deep spiking reinforcement learning

Bibliographic Details
Main Authors: Akl, Mahmoud, Ergene, Deniz, Walter, Florian, Knoll, Alois
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9894879/
https://www.ncbi.nlm.nih.gov/pubmed/36742191
http://dx.doi.org/10.3389/fnbot.2022.1075647
_version_ 1784881827389898752
author Akl, Mahmoud
Ergene, Deniz
Walter, Florian
Knoll, Alois
author_facet Akl, Mahmoud
Ergene, Deniz
Walter, Florian
Knoll, Alois
author_sort Akl, Mahmoud
collection PubMed
description Deep reinforcement learning (DRL) combines reinforcement learning algorithms with deep neural networks (DNNs). Spiking neural networks (SNNs) have been shown to be a biologically plausible and energy-efficient alternative to DNNs. Since the introduction of surrogate gradient approaches, which made it possible to overcome the discontinuity of the spike function, SNNs can be trained with the backpropagation through time (BPTT) algorithm. While largely explored on supervised learning problems, little work has been done on investigating the use of SNNs as function approximators in DRL. Here we show how SNNs can be applied to different DRL algorithms such as Deep Q-Network (DQN) and Twin-Delayed Deep Deterministic Policy Gradient (TD3) for discrete and continuous action space environments, respectively. We found that SNNs are sensitive to the additional hyperparameters introduced by spiking neuron models, such as current and voltage decay factors and firing thresholds, and that extensive hyperparameter tuning is inevitable. However, we show that increasing the simulation time of SNNs and applying a two-neuron encoding to the input observations help reduce the sensitivity to the membrane parameters. Furthermore, we show that randomizing the membrane parameters, instead of selecting uniform values for all neurons, has a stabilizing effect on training. We conclude that SNNs can be utilized for learning complex continuous control problems with state-of-the-art DRL algorithms. While the training complexity increases, the resulting SNNs can be executed directly on neuromorphic processors and potentially benefit from their high energy efficiency.
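The abstract names three concrete techniques: surrogate-gradient training of spiking neurons with BPTT, a two-neuron encoding of the input observations, and randomized (rather than uniform) membrane parameters. The following is a minimal sketch of how these pieces could fit together in PyTorch; it is not the authors' implementation, and the fast-sigmoid surrogate, decay ranges, and threshold below are illustrative assumptions.

import torch


class SurrogateSpike(torch.autograd.Function):
    """Heaviside step in the forward pass, surrogate gradient in the backward pass."""

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # Assumed fast-sigmoid surrogate: d(spike)/dv ~ 1 / (1 + |v|)^2
        return grad_output / (1.0 + v.abs()) ** 2


def two_neuron_encode(obs):
    # Split each signed observation x into the non-negative pair (max(x,0), max(-x,0)),
    # so positive and negative values drive separate input neurons.
    return torch.cat([obs.clamp(min=0.0), (-obs).clamp(min=0.0)], dim=-1)


class LIFLayer(torch.nn.Module):
    """Leaky integrate-and-fire layer with per-neuron randomized decay factors."""

    def __init__(self, n_in, n_out, threshold=1.0):
        super().__init__()
        self.fc = torch.nn.Linear(n_in, n_out)
        self.threshold = threshold
        # One randomized decay per neuron instead of a single shared value;
        # the ranges are placeholders, not the paper's values.
        self.register_buffer("i_decay", 0.3 + 0.6 * torch.rand(n_out))
        self.register_buffer("v_decay", 0.3 + 0.6 * torch.rand(n_out))

    def forward(self, x_seq):
        # x_seq: (T, batch, n_in). Simulate T timesteps, return total spike counts.
        i = x_seq.new_zeros(x_seq.shape[1], self.fc.out_features)
        v = torch.zeros_like(i)
        total = torch.zeros_like(i)
        for x in x_seq:
            i = self.i_decay * i + self.fc(x)              # synaptic current with decay
            v = self.v_decay * v + i                       # membrane voltage with decay
            s = SurrogateSpike.apply(v - self.threshold)   # differentiable spike
            v = v - s * self.threshold                     # soft reset after a spike
            total = total + s
        return total


# Usage: encode a 4-dimensional observation and repeat it over 16 simulation
# steps; the output spike counts can serve as Q-values for a DQN-style head.
obs = torch.tensor([[0.1, -0.5, 0.02, -0.3]])
x_seq = two_neuron_encode(obs).unsqueeze(0).repeat(16, 1, 1)  # (T=16, batch=1, 8)
print(LIFLayer(8, 2)(x_seq))

Increasing T (the simulation time) and the two-neuron encoding are the two knobs the abstract reports as reducing sensitivity to the membrane parameters.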
format Online
Article
Text
id pubmed-9894879
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9894879 2023-02-04 Toward robust and scalable deep spiking reinforcement learning Akl, Mahmoud; Ergene, Deniz; Walter, Florian; Knoll, Alois. Front Neurorobot, Neuroscience. Frontiers Media S.A. 2023-01-20 /pmc/articles/PMC9894879/ /pubmed/36742191 http://dx.doi.org/10.3389/fnbot.2022.1075647 Text en Copyright © 2023 Akl, Ergene, Walter and Knoll. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Neuroscience
Akl, Mahmoud
Ergene, Deniz
Walter, Florian
Knoll, Alois
Toward robust and scalable deep spiking reinforcement learning
title Toward robust and scalable deep spiking reinforcement learning
title_full Toward robust and scalable deep spiking reinforcement learning
title_fullStr Toward robust and scalable deep spiking reinforcement learning
title_full_unstemmed Toward robust and scalable deep spiking reinforcement learning
title_short Toward robust and scalable deep spiking reinforcement learning
title_sort toward robust and scalable deep spiking reinforcement learning
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9894879/
https://www.ncbi.nlm.nih.gov/pubmed/36742191
http://dx.doi.org/10.3389/fnbot.2022.1075647
work_keys_str_mv AT aklmahmoud towardrobustandscalabledeepspikingreinforcementlearning
AT ergenedeniz towardrobustandscalabledeepspikingreinforcementlearning
AT walterflorian towardrobustandscalabledeepspikingreinforcementlearning
AT knollalois towardrobustandscalabledeepspikingreinforcementlearning