Revisiting Batch Normalization for Training Low-Latency Deep Spiking Neural Networks From Scratch
Spiking Neural Networks (SNNs) have recently emerged as an alternative to deep learning owing to sparse, asynchronous and binary event (or spike) driven processing that can yield huge energy efficiency benefits on neuromorphic hardware. However, SNNs convey temporally-varying spike activation throu...
Main Authors: | Kim, Youngeun; Panda, Priyadarshini |
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2021 |
Subjects: | Neuroscience |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8695433/ https://www.ncbi.nlm.nih.gov/pubmed/34955725 http://dx.doi.org/10.3389/fnins.2021.773954 |
_version_ | 1784619576576704512 |
author | Kim, Youngeun; Panda, Priyadarshini |
author_facet | Kim, Youngeun; Panda, Priyadarshini |
author_sort | Kim, Youngeun |
collection | PubMed |
description | Spiking Neural Networks (SNNs) have recently emerged as an alternative to deep learning owing to sparse, asynchronous and binary event (or spike) driven processing that can yield huge energy efficiency benefits on neuromorphic hardware. However, SNNs convey temporally-varying spike activation through time that is likely to induce a large variation of forward activation and backward gradients, resulting in unstable training. To address this training issue in SNNs, we revisit Batch Normalization (BN) and propose a temporal Batch Normalization Through Time (BNTT) technique. Different from previous BN techniques with SNNs, we find that varying the BN parameters at every time-step allows the model to learn the time-varying input distribution better. Specifically, our proposed BNTT decouples the parameters in a BNTT layer along the time axis to capture the temporal dynamics of spikes. We demonstrate BNTT on the CIFAR-10, CIFAR-100, Tiny-ImageNet, event-driven DVS-CIFAR10, and Sequential MNIST datasets and show near state-of-the-art performance. We conduct a comprehensive analysis of the temporal characteristics of BNTT and showcase interesting benefits toward robustness against random and adversarial noise. Further, by monitoring the learnt parameters of BNTT, we find that we can do temporal early exit. That is, we can reduce the inference latency by ~5-20 time-steps from the original training latency. The code has been released at https://github.com/Intelligent-Computing-Lab-Yale/BNTT-Batch-Normalization-Through-Time. |
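The description above names the two ideas of BNTT: a batch-normalization layer whose learnable parameters and running statistics are decoupled along the time axis, so each simulation time-step normalizes with its own set of parameters, and a temporal early exit obtained by monitoring the learnt parameters. A minimal PyTorch sketch of both ideas follows; the class and function names, the use of one `BatchNorm2d` per time-step, and the gamma threshold are illustrative assumptions, not the authors' exact implementation (see the released repository linked above for that).

```python
# Minimal, illustrative sketch of Batch Normalization Through Time (BNTT).
# Assumption: a PyTorch-style, time-unrolled SNN in which the same layer is
# applied once per simulation time-step; the released repository is the
# reference implementation, not this sketch.
import torch
import torch.nn as nn


class BNTT(nn.Module):
    """Keeps a separate BatchNorm2d (its own gamma/beta and running
    statistics) for every time-step, decoupling BN along the time axis."""

    def __init__(self, num_features: int, num_timesteps: int):
        super().__init__()
        self.bns = nn.ModuleList(
            [nn.BatchNorm2d(num_features) for _ in range(num_timesteps)]
        )

    def forward(self, x: torch.Tensor, t: int) -> torch.Tensor:
        # x: pre-activation (input current) at time-step t,
        # shaped (batch, channels, height, width).
        return self.bns[t](x)


def last_useful_timestep(bntt_layer: "BNTT", threshold: float = 1e-2) -> int:
    """Hypothetical helper for the 'temporal early exit' idea: if the learnt
    gamma values of late time-steps are close to zero, those steps contribute
    little and inference can be stopped earlier."""
    mean_gamma = [bn.weight.abs().mean().item() for bn in bntt_layer.bns]
    useful = [t for t, g in enumerate(mean_gamma) if g > threshold]
    return max(useful) if useful else 0


if __name__ == "__main__":
    layer = BNTT(num_features=64, num_timesteps=25)
    x_t = torch.randn(8, 64, 32, 32)   # one time-step's feature map
    y_t = layer(x_t, t=0)              # uses the parameters kept for t = 0
    print(y_t.shape, last_useful_timestep(layer))
```

Because each time-step keeps its own running statistics, inference at time-step t reuses exactly the statistics accumulated for t during training, which is what lets the layer track the time-varying input distribution described in the abstract.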
format | Online Article Text |
id | pubmed-8695433 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8695433 2021-12-24 Revisiting Batch Normalization for Training Low-Latency Deep Spiking Neural Networks From Scratch Kim, Youngeun; Panda, Priyadarshini Front Neurosci Neuroscience (abstract as given in the description field above) Frontiers Media S.A. 2021-12-09 /pmc/articles/PMC8695433/ /pubmed/34955725 http://dx.doi.org/10.3389/fnins.2021.773954 Text en Copyright © 2021 Kim and Panda. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience; Kim, Youngeun; Panda, Priyadarshini; Revisiting Batch Normalization for Training Low-Latency Deep Spiking Neural Networks From Scratch |
title | Revisiting Batch Normalization for Training Low-Latency Deep Spiking Neural Networks From Scratch |
title_full | Revisiting Batch Normalization for Training Low-Latency Deep Spiking Neural Networks From Scratch |
title_fullStr | Revisiting Batch Normalization for Training Low-Latency Deep Spiking Neural Networks From Scratch |
title_full_unstemmed | Revisiting Batch Normalization for Training Low-Latency Deep Spiking Neural Networks From Scratch |
title_short | Revisiting Batch Normalization for Training Low-Latency Deep Spiking Neural Networks From Scratch |
title_sort | revisiting batch normalization for training low-latency deep spiking neural networks from scratch |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8695433/ https://www.ncbi.nlm.nih.gov/pubmed/34955725 http://dx.doi.org/10.3389/fnins.2021.773954 |
work_keys_str_mv | AT kimyoungeun revisitingbatchnormalizationfortraininglowlatencydeepspikingneuralnetworksfromscratch AT pandapriyadarshini revisitingbatchnormalizationfortraininglowlatencydeepspikingneuralnetworksfromscratch |