NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals

Neural speech decoding-driven brain-computer interface (BCI) or speech-BCI is a novel paradigm for exploring communication restoration for locked-in (fully paralyzed but aware) patients. Speech-BCIs aim to map a direct transformation from neural signals to text or speech, which has the potential for...

Bibliographic Details
Main authors:	Dash, Debadatta; Ferrari, Paul; Dutta, Satwik; Wang, Jun
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7218843/ https://www.ncbi.nlm.nih.gov/pubmed/32316162 http://dx.doi.org/10.3390/s20082248

author	Dash, Debadatta; Ferrari, Paul; Dutta, Satwik; Wang, Jun
collection	PubMed
description	Neural speech decoding-driven brain-computer interface (BCI) or speech-BCI is a novel paradigm for exploring communication restoration for locked-in (fully paralyzed but aware) patients. Speech-BCIs aim to map a direct transformation from neural signals to text or speech, which has the potential for a higher communication rate than the current BCIs. Although recent progress has demonstrated the potential of speech-BCIs from either invasive or non-invasive neural signals, the majority of the systems developed so far still assume that the onset and offset of the speech utterances within the continuous neural recordings are known. This lack of real-time voice/speech activity detection (VAD) is a current obstacle for future applications of neural speech decoding wherein BCI users can have a continuous conversation with other speakers. To address this issue, we attempted to automatically detect voice/speech activity directly from neural signals recorded using magnetoencephalography (MEG). First, we classified whole segments of pre-speech, speech, and post-speech in the neural signals using a support vector machine (SVM). Second, for continuous prediction, we used a long short-term memory-recurrent neural network (LSTM-RNN) to efficiently decode the voice activity at each time point via its sequential pattern-learning mechanism. Experimental results demonstrated the possibility of real-time VAD directly from non-invasive neural signals with about 88% accuracy.
format	Online Article Text
id	pubmed-7218843
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7218843 2020-05-22 NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals Dash, Debadatta; Ferrari, Paul; Dutta, Satwik; Wang, Jun Sensors (Basel) Article MDPI 2020-04-16 /pmc/articles/PMC7218843/ /pubmed/32316162 http://dx.doi.org/10.3390/s20082248 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7218843/ https://www.ncbi.nlm.nih.gov/pubmed/32316162 http://dx.doi.org/10.3390/s20082248
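
The abstract in this record describes decoding voice activity "at each time point" and reports accuracy for that continuous prediction. As a reading aid only, the following minimal sketch shows how such per-time-point output could be scored and turned into speech onsets and offsets. It is not the authors' implementation: the function names `frame_accuracy` and `speech_segments`, the 0/1 label encoding (1 = speech), and the toy label sequences are all assumptions made for illustration, and no MEG data or trained model is involved.

```python
from typing import List, Sequence, Tuple


def frame_accuracy(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Fraction of time points where the predicted label equals the reference.

    Hypothetical helper: labels are assumed to be 0 (no speech) or 1 (speech).
    """
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    if len(reference) == 0:
        raise ValueError("label sequences must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def speech_segments(labels: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (onset, offset) index pairs for each run of speech labels.

    The offset is exclusive, so labels[onset:offset] is entirely speech.
    """
    segments: List[Tuple[int, int]] = []
    onset = None
    for i, label in enumerate(labels):
        if label == 1 and onset is None:
            onset = i  # speech starts here
        elif label != 1 and onset is not None:
            segments.append((onset, i))  # speech ended at the previous point
            onset = None
    if onset is not None:
        # Speech continues to the end of the recording.
        segments.append((onset, len(labels)))
    return segments


# Toy example (made-up labels, not data from the study).
reference = [0, 0, 1, 1, 1, 1, 0, 0]
predicted = [0, 1, 1, 1, 1, 0, 0, 0]

print(frame_accuracy(predicted, reference))  # 0.75
print(speech_segments(predicted))            # [(1, 5)]
print(speech_segments(reference))            # [(2, 6)]
```

In this toy case the predicted speech segment starts and ends one time point early, so two of the eight time points disagree with the reference.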
