
Unsupervised speech recognition through spike-timing-dependent plasticity in a convolutional spiking neural network

Speech recognition (SR) has been improved significantly by artificial neural networks (ANNs), but ANNs have the drawbacks of biological implausibility and excessive power consumption because of the nonlocal transfer of real-valued errors and weights. Spiking neural networks (SNNs) have the p...


Bibliographic Details
Main Authors:	Dong, Meng; Huang, Xuhui; Xu, Bo
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2018
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6264808/ https://www.ncbi.nlm.nih.gov/pubmed/30496179 http://dx.doi.org/10.1371/journal.pone.0204596

_version_	1783375569539301376
author	Dong, Meng; Huang, Xuhui; Xu, Bo
author_facet	Dong, Meng; Huang, Xuhui; Xu, Bo
author_sort	Dong, Meng
collection	PubMed
description	Speech recognition (SR) has been improved significantly by artificial neural networks (ANNs), but ANNs have the drawbacks of biological implausibility and excessive power consumption because of the nonlocal transfer of real-valued errors and weights. Spiking neural networks (SNNs) have the potential to overcome these drawbacks because of their efficient spike communication and their natural ability to use the kinds of synaptic plasticity rules found in the brain for weight modification. However, existing SNN models for SR either performed poorly or were trained in biologically implausible ways. In this paper, we present a biologically inspired convolutional SNN model for SR. The network adopts the time-to-first-spike coding scheme for fast and efficient information processing. A biological learning rule, spike-timing-dependent plasticity (STDP), is used to adjust the synaptic weights of convolutional neurons to form receptive fields in an unsupervised way. In the convolutional structure, a local weight-sharing strategy is introduced, which could lead to better feature extraction from speech signals than global weight sharing. We first evaluated the SNN model with a linear support vector machine (SVM) on the TIDIGITS dataset, where it reached a performance of 97.5%, comparable to the best results of ANNs. Further analysis of the network outputs showed that the output data are not only more linearly separable but also lower-dimensional and sparse. To further confirm the validity of our model, we trained it on a more difficult recognition task based on the TIMIT dataset, where it reached a performance of 93.8%. Moreover, a linear spike-based classifier, the tempotron, also achieves accuracies very close to those of the SVM on both tasks. These results demonstrate that an STDP-based convolutional SNN model equipped with local weight sharing and temporal coding is capable of solving the SR task accurately and efficiently.
format	Online Article Text
id	pubmed-6264808
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-6264808 2018-12-19 Unsupervised speech recognition through spike-timing-dependent plasticity in a convolutional spiking neural network Dong, Meng; Huang, Xuhui; Xu, Bo PLoS One Research Article Public Library of Science 2018-11-29 /pmc/articles/PMC6264808/ /pubmed/30496179 http://dx.doi.org/10.1371/journal.pone.0204596 Text en © 2018 Dong et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Unsupervised speech recognition through spike-timing-dependent plasticity in a convolutional spiking neural network
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6264808/ https://www.ncbi.nlm.nih.gov/pubmed/30496179 http://dx.doi.org/10.1371/journal.pone.0204596
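
The description in this record names two mechanisms, time-to-first-spike coding and STDP, without giving their equations. The following is a minimal Python sketch of the general idea only, not the authors' implementation. The linear intensity-to-latency mapping, the multiplicative `w * (1 - w)` update, and all names and parameter values (`ttfs_encode`, `stdp_update`, `t_max`, `a_plus`, `a_minus`) are assumptions made for illustration.

```python
def ttfs_encode(intensities, t_max=1.0):
    """Time-to-first-spike coding (assumed linear mapping).

    Each intensity is clipped to [0, 1]; a stronger input fires earlier,
    so intensity 1.0 spikes at time 0 and intensity 0.0 at time t_max.
    """
    return [t_max * (1.0 - min(max(x, 0.0), 1.0)) for x in intensities]


def stdp_update(weights, pre_times, post_time, a_plus=0.05, a_minus=0.03):
    """Simplified STDP step for one postsynaptic spike (assumed rule).

    A synapse whose presynaptic spike arrived at or before the
    postsynaptic spike is strengthened; any other synapse is weakened.
    The w * (1 - w) factor keeps weights that start in (0, 1) inside (0, 1).
    """
    updated = []
    for w, t_pre in zip(weights, pre_times):
        rate = a_plus if t_pre <= post_time else -a_minus
        updated.append(w + rate * w * (1.0 - w))
    return updated


# Hypothetical example: three input intensities, one postsynaptic spike.
spikes = ttfs_encode([0.9, 0.2, 0.6])
weights = stdp_update([0.5, 0.5, 0.5], spikes, post_time=0.5)
```

In this example the strongest input (0.9) spikes first, and the two inputs that fire before the postsynaptic spike have their weights increased while the late one is decreased. The update uses only the spike times at each synapse, which is the sense in which the learning is local and unsupervised.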

