Finger Gesture Spotting from Long Sequences Based on Multi-Stream Recurrent Neural Networks

Bibliographic Details
Main Authors: Benitez-Garcia, Gibran; Haris, Muhammad; Tsuda, Yoshiyuki; Ukita, Norimichi
Format: Online Article (Text)
Language: English
Published: MDPI, 2020
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7014506/
https://www.ncbi.nlm.nih.gov/pubmed/31963623
http://dx.doi.org/10.3390/s20020528
Collection: PubMed
Description: Gesture spotting is an essential task for recognizing finger gestures used to control in-car touchless interfaces. Automated methods for this task must detect the video segments in which gestures are observed, discard natural hand behaviors that may look like target gestures, and operate online. In this paper, we address these challenges with a recurrent neural architecture for online finger gesture spotting. We propose a multi-stream network that merges hand and hand-location features, which help discriminate target gestures from natural hand movements, since the two may not occur in the same 3D spatial location. Our multi-stream recurrent neural network (RNN) recurrently learns semantic information, allowing it to spot gestures online in long untrimmed video sequences. To validate our method, we collected a finger gesture dataset in an in-vehicle scenario of an autonomous car: 226 videos with more than 2100 continuous gesture instances were captured with a depth sensor. On this dataset, our gesture spotting approach outperforms state-of-the-art methods, improving recall and precision by about 10% and 15%, respectively. Furthermore, we demonstrate that, combined with an existing gesture classifier (a 3D convolutional neural network), our proposal achieves better performance than previous hand gesture recognition methods.
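
The record does not specify the exact architecture, but the idea described above (two feature streams fused and processed recurrently, with per-frame decisions so spotting can run online) lends itself to a compact sketch. The following is a minimal, illustrative PyTorch sketch under stated assumptions: per-frame hand-appearance features from some CNN backbone and 3D hand-location coordinates as inputs; all layer names, sizes, and the fusion scheme are hypothetical, not the authors' implementation.

# Minimal two-stream recurrent gesture-spotter sketch (illustrative only;
# layer sizes and the concatenation-based fusion are assumptions).
import torch
import torch.nn as nn

class MultiStreamSpotter(nn.Module):
    def __init__(self, hand_dim=512, loc_dim=3, hidden=256):
        super().__init__()
        # Stream 1: per-frame hand-appearance features (e.g., from a CNN backbone).
        self.hand_proj = nn.Linear(hand_dim, hidden)
        # Stream 2: per-frame 3D hand-location features (x, y, depth).
        self.loc_proj = nn.Linear(loc_dim, hidden)
        # Unidirectional GRU over the fused streams: the prediction at frame t
        # depends only on frames <= t, which is what permits online spotting.
        self.rnn = nn.GRU(2 * hidden, hidden, batch_first=True)
        # Per-frame binary score: gesture vs. non-gesture frame.
        self.head = nn.Linear(hidden, 1)

    def forward(self, hand_feats, loc_feats, state=None):
        # hand_feats: (B, T, hand_dim); loc_feats: (B, T, loc_dim)
        fused = torch.cat([self.hand_proj(hand_feats),
                           self.loc_proj(loc_feats)], dim=-1)
        out, state = self.rnn(fused, state)       # (B, T, hidden)
        scores = torch.sigmoid(self.head(out))    # (B, T, 1) per-frame gesture prob.
        return scores.squeeze(-1), state          # carry `state` across chunks

# Streaming usage: feed frames chunk by chunk, reusing the hidden state.
model = MultiStreamSpotter()
state = None
hand_chunk = torch.randn(1, 16, 512)   # 16 new frames of appearance features
loc_chunk = torch.randn(1, 16, 3)      # matching 3D hand locations
probs, state = model(hand_chunk, loc_chunk, state)
spotted = probs > 0.5                  # frames flagged as part of a gesture

Frame runs flagged by such a spotter would then be handed to a separate gesture classifier for labeling; the abstract pairs the spotter with an existing 3D CNN for that step.
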
Record ID: pubmed-7014506
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: Sensors (Basel)
Published Online: 2020-01-18
License: © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).