Transformer-based hand gesture recognition from instantaneous to fused neural decomposition of high-density EMG signals

Designing efficient and labor-saving prosthetic hands requires powerful hand gesture recognition algorithms that can achieve high accuracy with limited complexity and latency. In this context, the paper proposes a Compact Transformer-based Hand Gesture Recognition framework referred to as [Formula:...

Bibliographic Details
Main Authors:	Montazerin, Mansooreh; Rahimian, Elahe; Naderkhani, Farnoosh; Atashzar, S. Farokh; Yanushkevich, Svetlana; Mohammadi, Arash
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329032/ https://www.ncbi.nlm.nih.gov/pubmed/37419881 http://dx.doi.org/10.1038/s41598-023-36490-w

_version_	1785069936735944704
author	Montazerin, Mansooreh; Rahimian, Elahe; Naderkhani, Farnoosh; Atashzar, S. Farokh; Yanushkevich, Svetlana; Mohammadi, Arash
author_sort	Montazerin, Mansooreh
collection	PubMed
description	Designing efficient and labor-saving prosthetic hands requires powerful hand gesture recognition algorithms that can achieve high accuracy with limited complexity and latency. In this context, the paper proposes a Compact Transformer-based Hand Gesture Recognition framework referred to as [Formula: see text], which employs a vision transformer network to conduct hand gesture recognition using high-density surface EMG (HD-sEMG) signals. Taking advantage of the attention mechanism, which is incorporated into the transformer architectures, our proposed [Formula: see text] framework overcomes major constraints associated with most of the existing deep learning models, such as model complexity, the need for feature engineering, the inability to consider both temporal and spatial information of HD-sEMG signals, and the need for a large number of training samples. The attention mechanism in the proposed model identifies similarities among different data segments with a greater capacity for parallel computations and addresses the memory limitation problems while dealing with inputs of large sequence lengths. [Formula: see text] can be trained from scratch without any need for transfer learning and can simultaneously extract both temporal and spatial features of HD-sEMG data. Additionally, the [Formula: see text] framework can perform instantaneous recognition using an sEMG image spatially composed from HD-sEMG signals. A variant of the [Formula: see text] is also designed to incorporate microscopic neural drive information in the form of Motor Unit Spike Trains (MUSTs) extracted from HD-sEMG signals using Blind Source Separation (BSS). This variant is combined with its baseline version via a hybrid architecture to evaluate the potential of fusing macroscopic and microscopic neural drive information. The utilized HD-sEMG dataset involves 128 electrodes that collect the signals related to 65 isometric hand gestures of 20 subjects. The proposed [Formula: see text] framework is applied to window sizes of 31.25, 62.5, 125, and 250 ms of the above-mentioned dataset using 32, 64, and 128 electrode channels. Our results are obtained via 5-fold cross-validation by first applying the proposed framework on the dataset of each subject separately and then averaging the accuracies among all the subjects. The average accuracy over all the participants using 32 electrodes and a window size of 31.25 ms is 86.23%, which gradually increases to 91.98% for 128 electrodes and a window size of 250 ms. The [Formula: see text] achieves an accuracy of 89.13% for instantaneous recognition based on a single frame of the HD-sEMG image. The proposed model is statistically compared with a 3D Convolutional Neural Network (CNN) and two different variants of Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA) models. The accuracy results for each of the above-mentioned models are paired with their precision, recall, F1 score, required memory, and train/test times. The results corroborate the effectiveness of the proposed [Formula: see text] framework compared to its counterparts.
format	Online Article Text
id	pubmed-10329032
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-10329032 2023-07-09 Transformer-based hand gesture recognition from instantaneous to fused neural decomposition of high-density EMG signals Montazerin, Mansooreh Rahimian, Elahe Naderkhani, Farnoosh Atashzar, S. Farokh Yanushkevich, Svetlana Mohammadi, Arash Sci Rep Article Nature Publishing Group UK 2023-07-07 /pmc/articles/PMC10329032/ /pubmed/37419881 http://dx.doi.org/10.1038/s41598-023-36490-w Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title	Transformer-based hand gesture recognition from instantaneous to fused neural decomposition of high-density EMG signals
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329032/ https://www.ncbi.nlm.nih.gov/pubmed/37419881 http://dx.doi.org/10.1038/s41598-023-36490-w