Efficient Self-Attention Model for Speech Recognition-Based Assistive Robots Control

Assistive robots are tools that people living with upper body disabilities can leverage to autonomously perform Activities of Daily Living (ADL). Unfortunately, conventional control methods still rely on low-dimensional, easy-to-implement interfaces such as joysticks that tend to be unintuitive and cumbersome to use. In contrast, vocal commands may represent a viable and intuitive alternative. This work represents an important step toward providing a viable vocal interface for people living with upper limb disabilities by proposing a novel lightweight vocal command recognition system. The proposed model leverages the MobileNetV2 architecture, augmenting it with a novel approach to the self-attention mechanism, achieving a new state-of-the-art performance for Keyword Spotting (KWS) on the Google Speech Commands Dataset (GSCD). Moreover, this work presents a new dataset, referred to as the French Speech Commands Dataset (FSCD), comprising 4963 vocal command utterances. Using the GSCD as the source, we used Transfer Learning (TL) to adapt the model to this cross-language task. TL has been shown to significantly improve the model performance on the FSCD. The viability of the proposed approach is further demonstrated through real-life control of a robotic arm by four healthy participants using both the proposed vocal interface and a joystick.
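The abstract's central technical idea, augmenting a convolutional backbone with self-attention for keyword spotting, can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the authors' architecture: the paper's "novel approach to the self-attention mechanism" is not reproduced here, the weights are random placeholders, and the frame/feature dimensions are arbitrary. The sketch only shows plain scaled dot-product self-attention applied to a sequence of acoustic feature frames such as a MobileNetV2-style extractor might produce.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(frames, d_k=32, seed=0):
    """Plain scaled dot-product self-attention over a (T, d) sequence of
    acoustic feature frames. Projection weights are random placeholders
    standing in for learned parameters."""
    rng = np.random.default_rng(seed)
    T, d = frames.shape
    W_q = rng.standard_normal((d, d_k)) / np.sqrt(d)  # query projection
    W_k = rng.standard_normal((d, d_k)) / np.sqrt(d)  # key projection
    W_v = rng.standard_normal((d, d_k)) / np.sqrt(d)  # value projection
    Q, K, V = frames @ W_q, frames @ W_k, frames @ W_v
    scores = Q @ K.T / np.sqrt(d_k)   # (T, T) frame-to-frame affinities
    attn = softmax(scores, axis=-1)   # each row is a distribution over frames
    return attn @ V, attn             # contextualized frames, attention map

# Example: 49 frames of 64-dim features, roughly what a 1 s clip might yield.
feats = np.random.default_rng(1).standard_normal((49, 64))
context, attn = self_attention(feats)
```

For keyword spotting, the contextualized frames would typically be pooled over time and fed to a small classifier head; the attention map lets every frame weigh evidence from the whole utterance rather than a fixed receptive field.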

Bibliographic Details
Main Authors: Poirier, Samuel; Côté-Allard, Ulysse; Routhier, François; Campeau-Lecours, Alexandre
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10347238/
https://www.ncbi.nlm.nih.gov/pubmed/37447906
http://dx.doi.org/10.3390/s23136056
id: pubmed-10347238
collection: PubMed
institution: National Center for Biotechnology Information
record format: MEDLINE/PubMed
journal: Sensors (Basel)
topic: Article
published online: 2023-06-30
license: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).