Human EEG and Recurrent Neural Networks Exhibit Common Temporal Dynamics During Speech Recognition

Bibliographic Details
Main Authors: Hashemnia, Saeedeh, Grasse, Lukas, Soni, Shweta, Tata, Matthew S.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Neuroscience
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8296978/
https://www.ncbi.nlm.nih.gov/pubmed/34305540
http://dx.doi.org/10.3389/fnsys.2021.617605
_version_ 1783725752006475776
author Hashemnia, Saeedeh
Grasse, Lukas
Soni, Shweta
Tata, Matthew S.
author_facet Hashemnia, Saeedeh
Grasse, Lukas
Soni, Shweta
Tata, Matthew S.
author_sort Hashemnia, Saeedeh
collection PubMed
description Recent deep-learning artificial neural networks have shown remarkable success in recognizing natural human speech; however, the reasons for their success are not entirely understood. These methods may succeed because state-of-the-art networks use recurrent layers or dilated convolutional layers that enable the network to use a time-dependent feature space. The importance of time-dependent features in human cortical mechanisms of speech perception, measured by electroencephalography (EEG) and magnetoencephalography (MEG), has also been of particular recent interest. It is possible that recurrent neural networks (RNNs) achieve their success by emulating aspects of cortical dynamics, albeit through very different computational mechanisms. In that case, we should observe commonalities between the temporal dynamics of deep-learning models, particularly in recurrent layers, and brain electrical activity (EEG) during speech perception. We explored this prediction by presenting the same sentences to both human listeners and the Deep Speech RNN, and then compared the temporal dynamics of the EEG and RNN units for identical sentences. We tested whether the recently discovered phenomenon of envelope phase tracking in the human EEG is also evident in RNN hidden layers. We furthermore predicted that the clustering of dissimilarity between model representations of pairs of stimuli would be similar in both RNN and EEG dynamics. We found that the dynamics of both the recurrent layer of the network and human EEG signals exhibit envelope phase tracking with similar time lags. We also computed the representational distance matrices (RDMs) of brain and network responses to speech stimuli. The model RDMs became more similar to the brain RDM progressing from early network layers to later ones, peaking at the recurrent layer. These results suggest that the Deep Speech RNN captures a representation of the temporal features of speech in a manner similar to the human brain.
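The two analyses named in the description, envelope phase tracking and representational distance matrices (RDMs), can be prototyped with standard scientific-Python tools. The sketches below are illustrative only: the array names, sampling rate, 4-8 Hz band, lag range, layer names, and the correlation-distance/Spearman choices are assumptions, not the authors' pipeline.

```python
# Minimal sketch of envelope phase tracking, assuming hypothetical 1-D arrays
# `speech` (the audio waveform) and `signal` (one EEG channel, or the mean
# activation of an RNN layer) resampled to a common rate `fs`. The 4-8 Hz
# band and 0-400 ms lag range are assumptions, not values from the paper.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_phase(x, fs, lo=4.0, hi=8.0):
    # Instantaneous phase of x within a band, via bandpass + Hilbert transform.
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, x)))

def phase_locking_by_lag(speech, signal, fs, max_lag_s=0.4):
    # Phase-locking value (PLV) between the speech-envelope phase and the
    # signal phase at each lag; the lag of the PLV peak estimates the delay
    # with which the signal tracks the envelope.
    envelope = np.abs(hilbert(speech))        # broadband amplitude envelope
    p_env = band_phase(envelope, fs)
    p_sig = band_phase(signal, fs)
    lags = np.arange(int(max_lag_s * fs))
    plv = np.array([np.abs(np.mean(np.exp(1j * (p_env[:len(p_env) - lag]
                                                - p_sig[lag:]))))
                    for lag in lags])
    return lags / fs, plv
```

A similarly hedged sketch of the layer-wise RDM comparison: a dissimilarity value is computed for the responses to every pair of stimuli, and each model layer's RDM is then ranked against the brain RDM. Correlation distance and Spearman's rho are common choices in representational similarity analysis, not necessarily the paper's exact metrics; the data shapes and layer names are hypothetical.

```python
# Minimal sketch of the RDM comparison, assuming hypothetical response arrays
# of shape (n_stimuli, n_features). Random stand-in data only.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(responses):
    # Condensed representational distance matrix: correlation distance
    # between the responses to every pair of stimuli.
    return pdist(responses, metric="correlation")

rng = np.random.default_rng(0)
eeg = rng.standard_normal((20, 128))          # e.g., flattened EEG epochs
layers = {                                    # hypothetical layer names
    "conv": rng.standard_normal((20, 512)),
    "dense": rng.standard_normal((20, 512)),
    "recurrent": rng.standard_normal((20, 1024)),
}

brain_rdm = rdm(eeg)
for name, acts in layers.items():
    rho, _ = spearmanr(brain_rdm, rdm(acts))
    print(f"{name}: Spearman rho vs. brain RDM = {rho:.3f}")
```

In the paper's result, this per-layer similarity rose toward the recurrent layer; with the random stand-in data above, the printed correlations will of course hover near zero.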
format Online
Article
Text
id pubmed-8296978
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8296978 2021-07-23 Front Syst Neurosci Neuroscience Frontiers Media S.A. 2021-07-08 /pmc/articles/PMC8296978/ /pubmed/34305540 http://dx.doi.org/10.3389/fnsys.2021.617605 Text en Copyright © 2021 Hashemnia, Grasse, Soni and Tata. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Neuroscience
Hashemnia, Saeedeh
Grasse, Lukas
Soni, Shweta
Tata, Matthew S.
Human EEG and Recurrent Neural Networks Exhibit Common Temporal Dynamics During Speech Recognition
title Human EEG and Recurrent Neural Networks Exhibit Common Temporal Dynamics During Speech Recognition
title_full Human EEG and Recurrent Neural Networks Exhibit Common Temporal Dynamics During Speech Recognition
title_fullStr Human EEG and Recurrent Neural Networks Exhibit Common Temporal Dynamics During Speech Recognition
title_full_unstemmed Human EEG and Recurrent Neural Networks Exhibit Common Temporal Dynamics During Speech Recognition
title_short Human EEG and Recurrent Neural Networks Exhibit Common Temporal Dynamics During Speech Recognition
title_sort human eeg and recurrent neural networks exhibit common temporal dynamics during speech recognition
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8296978/
https://www.ncbi.nlm.nih.gov/pubmed/34305540
http://dx.doi.org/10.3389/fnsys.2021.617605
work_keys_str_mv AT hashemniasaeedeh humaneegandrecurrentneuralnetworksexhibitcommontemporaldynamicsduringspeechrecognition
AT grasselukas humaneegandrecurrentneuralnetworksexhibitcommontemporaldynamicsduringspeechrecognition
AT sonishweta humaneegandrecurrentneuralnetworksexhibitcommontemporaldynamicsduringspeechrecognition
AT tatamatthews humaneegandrecurrentneuralnetworksexhibitcommontemporaldynamicsduringspeechrecognition