Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition
Artificial neural networks (ANN) have become the mainstream acoustic modeling technique for large vocabulary automatic speech recognition (ASR). A conventional ANN features a multi-layer architecture that requires massive amounts of computation. The brain-inspired spiking neural networks (SNN) close...
Main authors: | Wu, Jibin; Yılmaz, Emre; Zhang, Malu; Li, Haizhou; Tan, Kay Chen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7090229/ https://www.ncbi.nlm.nih.gov/pubmed/32256308 http://dx.doi.org/10.3389/fnins.2020.00199 |
_version_ | 1783509890434596864 |
---|---|
author | Wu, Jibin; Yılmaz, Emre; Zhang, Malu; Li, Haizhou; Tan, Kay Chen |
author_sort | Wu, Jibin |
collection | PubMed |
description | Artificial neural networks (ANN) have become the mainstream acoustic modeling technique for large vocabulary automatic speech recognition (ASR). A conventional ANN features a multi-layer architecture that requires massive amounts of computation. The brain-inspired spiking neural networks (SNN) closely mimic biological neural networks and can operate on low-power neuromorphic hardware with spike-based computation. Motivated by their unprecedented energy efficiency and rapid information processing capability, we explore the use of SNNs for speech recognition. In this work, we use SNNs for acoustic modeling and evaluate their performance on several large vocabulary recognition scenarios. The experimental results demonstrate ASR accuracies competitive with those of their ANN counterparts, while requiring only 10 algorithmic time steps and as few as 0.68 times the total synaptic operations to classify each audio frame. Integrating the algorithmic power of deep SNNs with energy-efficient neuromorphic hardware therefore offers an attractive solution for ASR applications running locally on mobile and embedded devices. |
format | Online Article Text |
id | pubmed-7090229 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7090229 2020-03-31 Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition. Wu, Jibin; Yılmaz, Emre; Zhang, Malu; Li, Haizhou; Tan, Kay Chen. Front Neurosci. Neuroscience. Artificial neural networks (ANN) have become the mainstream acoustic modeling technique for large vocabulary automatic speech recognition (ASR). A conventional ANN features a multi-layer architecture that requires massive amounts of computation. The brain-inspired spiking neural networks (SNN) closely mimic biological neural networks and can operate on low-power neuromorphic hardware with spike-based computation. Motivated by their unprecedented energy efficiency and rapid information processing capability, we explore the use of SNNs for speech recognition. In this work, we use SNNs for acoustic modeling and evaluate their performance on several large vocabulary recognition scenarios. The experimental results demonstrate ASR accuracies competitive with those of their ANN counterparts, while requiring only 10 algorithmic time steps and as few as 0.68 times the total synaptic operations to classify each audio frame. Integrating the algorithmic power of deep SNNs with energy-efficient neuromorphic hardware therefore offers an attractive solution for ASR applications running locally on mobile and embedded devices. Frontiers Media S.A. 2020-03-17. /pmc/articles/PMC7090229/ /pubmed/32256308 http://dx.doi.org/10.3389/fnins.2020.00199 Text en. Copyright © 2020 Wu, Yılmaz, Zhang, Li and Tan. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7090229/ https://www.ncbi.nlm.nih.gov/pubmed/32256308 http://dx.doi.org/10.3389/fnins.2020.00199 |