
Decoding of the speech envelope from EEG using the VLAAI deep neural network


Bibliographic Details
Main Authors: Accou, Bernd; Vanthornhout, Jonas; Van hamme, Hugo; Francart, Tom
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9842721/
https://www.ncbi.nlm.nih.gov/pubmed/36646740
http://dx.doi.org/10.1038/s41598-022-27332-2
_version_ 1784870207084298240
author Accou, Bernd
Vanthornhout, Jonas
Van hamme, Hugo
Francart, Tom
author_facet Accou, Bernd
Vanthornhout, Jonas
Van hamme, Hugo
Francart, Tom
author_sort Accou, Bernd
collection PubMed
description To investigate the processing of speech in the brain, simple linear models are commonly used to establish a relationship between brain signals and speech features. However, these linear models are ill-equipped to model a highly dynamic, complex, non-linear system like the brain, and they often require a substantial amount of subject-specific training data. This work introduces a novel speech decoder architecture: the Very Large Augmented Auditory Inference (VLAAI) network. The VLAAI network outperformed state-of-the-art subject-independent models (median Pearson correlation of 0.19, p < 0.001), a 52% increase over the well-established linear model. Using ablation techniques, we identified the relative importance of each part of the VLAAI network and found that the non-linear components and the output context module influenced model performance the most (10% relative performance increase). Subsequently, the VLAAI network was evaluated on a holdout dataset of 26 subjects and on a publicly available unseen dataset to test generalization to unseen subjects and stimuli. No significant difference was found between the default test set and the holdout subjects, nor between the default test set and the public dataset. The VLAAI network also significantly outperformed all baseline models on the public dataset. We evaluated the effect of training-set size by training the VLAAI network on data from 1 up to 80 subjects and evaluating on 26 holdout subjects, revealing a relationship between the number of subjects in the training set and performance on unseen subjects that follows a hyperbolic tangent function. Finally, the subject-independent VLAAI network was fine-tuned for the 26 holdout subjects to obtain subject-specific VLAAI models. With 5 minutes of data or more, a significant performance improvement was found, up to 34% (from 0.18 to 0.25 median Pearson correlation) with respect to the subject-independent VLAAI network.
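For context, the abstract's evaluation metric (per-subject Pearson correlation between the decoded and actual speech envelope, summarized by the median across subjects) and the reported tanh-shaped relationship between training-set size and performance can be sketched as below. This is a minimal illustration on synthetic data; the function names and signal parameters are assumptions for the example and are not taken from the VLAAI codebase.

```python
import numpy as np

def pearson_r(x, y):
    # Pearson correlation between a decoded envelope and the actual envelope.
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))

# Synthetic stand-in for per-subject decoding: each "decoded" envelope is a
# weak copy of the true envelope plus noise; scores are summarized by their
# median, as in the paper's "median Pearson correlation".
rng = np.random.default_rng(0)
scores = []
for _ in range(26):  # e.g. 26 holdout subjects
    envelope = rng.standard_normal(64 * 60)            # 60 s at 64 Hz (assumed)
    decoded = 0.2 * envelope + rng.standard_normal(envelope.shape)
    scores.append(pearson_r(decoded, envelope))
median_score = float(np.median(scores))

# The reported scaling law: performance vs. number of training subjects n
# follows a hyperbolic tangent, saturating at some ceiling a.
def tanh_curve(n, a, b):
    return a * np.tanh(n / b)
```

A curve of this form saturates for large n, matching the paper's observation that adding training subjects yields diminishing returns on unseen-subject performance.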
format Online
Article
Text
id pubmed-9842721
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9842721 2023-01-18 Decoding of the speech envelope from EEG using the VLAAI deep neural network Accou, Bernd; Vanthornhout, Jonas; Van hamme, Hugo; Francart, Tom Sci Rep Article Nature Publishing Group UK 2023-01-16 /pmc/articles/PMC9842721/ /pubmed/36646740 http://dx.doi.org/10.1038/s41598-022-27332-2 Text en © The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Accou, Bernd
Vanthornhout, Jonas
Van hamme, Hugo
Francart, Tom
Decoding of the speech envelope from EEG using the VLAAI deep neural network
title Decoding of the speech envelope from EEG using the VLAAI deep neural network
title_full Decoding of the speech envelope from EEG using the VLAAI deep neural network
title_fullStr Decoding of the speech envelope from EEG using the VLAAI deep neural network
title_full_unstemmed Decoding of the speech envelope from EEG using the VLAAI deep neural network
title_short Decoding of the speech envelope from EEG using the VLAAI deep neural network
title_sort decoding of the speech envelope from eeg using the vlaai deep neural network
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9842721/
https://www.ncbi.nlm.nih.gov/pubmed/36646740
http://dx.doi.org/10.1038/s41598-022-27332-2
work_keys_str_mv AT accoubernd decodingofthespeechenvelopefromeegusingthevlaaideepneuralnetwork
AT vanthornhoutjonas decodingofthespeechenvelopefromeegusingthevlaaideepneuralnetwork
AT hammehugovan decodingofthespeechenvelopefromeegusingthevlaaideepneuralnetwork
AT francarttom decodingofthespeechenvelopefromeegusingthevlaaideepneuralnetwork