Finger Gesture Spotting from Long Sequences Based on Multi-Stream Recurrent Neural Networks

Bibliographic Details
Main Authors: Benitez-Garcia, Gibran; Haris, Muhammad; Tsuda, Yoshiyuki; Ukita, Norimichi
Format: Online Article (Text)
Language: English
Published: MDPI, 2020
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7014506/
https://www.ncbi.nlm.nih.gov/pubmed/31963623
http://dx.doi.org/10.3390/s20020528
Collection: PubMed
Description: Gesture spotting is an essential task for recognizing finger gestures used to control in-car touchless interfaces. Automated methods for this task must detect the video segments in which gestures are observed, discard natural hand behaviors that may look like target gestures, and operate online. In this paper, we address these challenges with a recurrent neural architecture for online finger gesture spotting. We propose a multi-stream network that merges hand and hand-location features, which help discriminate target gestures from natural hand movements, since the two may not occur in the same 3D spatial location. Our multi-stream recurrent neural network (RNN) recurrently learns semantic information, allowing it to spot gestures online in long untrimmed video sequences. To validate our method, we collected a finger gesture dataset in an in-vehicle scenario of an autonomous car: 226 videos with more than 2100 continuous gesture instances were captured with a depth sensor. On this dataset, our gesture spotting approach outperforms state-of-the-art methods, improving recall and precision by about 10% and 15%, respectively. Furthermore, we demonstrate that, combined with an existing gesture classifier (a 3D convolutional neural network), our proposal achieves better performance than previous hand gesture recognition methods.
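
The record does not specify the exact architecture, but the idea described above (two feature streams fused and processed recurrently, with per-frame decisions so spotting can run online) lends itself to a compact sketch. The following is a minimal, illustrative PyTorch sketch under stated assumptions: per-frame hand-appearance features from some CNN backbone and 3D hand-location coordinates as inputs; all layer names, sizes, and the fusion scheme are hypothetical, not the authors' implementation.

# Minimal two-stream recurrent gesture-spotter sketch (illustrative only;
# layer sizes and the concatenation-based fusion are assumptions).
import torch
import torch.nn as nn

class MultiStreamSpotter(nn.Module):
    def __init__(self, hand_dim=512, loc_dim=3, hidden=256):
        super().__init__()
        # Stream 1: per-frame hand-appearance features (e.g., from a CNN backbone).
        self.hand_proj = nn.Linear(hand_dim, hidden)
        # Stream 2: per-frame 3D hand-location features (x, y, depth).
        self.loc_proj = nn.Linear(loc_dim, hidden)
        # Unidirectional GRU over the fused streams: the prediction at frame t
        # depends only on frames <= t, which is what permits online spotting.
        self.rnn = nn.GRU(2 * hidden, hidden, batch_first=True)
        # Per-frame binary score: gesture vs. non-gesture frame.
        self.head = nn.Linear(hidden, 1)

    def forward(self, hand_feats, loc_feats, state=None):
        # hand_feats: (B, T, hand_dim); loc_feats: (B, T, loc_dim)
        fused = torch.cat([self.hand_proj(hand_feats),
                           self.loc_proj(loc_feats)], dim=-1)
        out, state = self.rnn(fused, state)       # (B, T, hidden)
        scores = torch.sigmoid(self.head(out))    # (B, T, 1) per-frame gesture prob.
        return scores.squeeze(-1), state          # carry `state` across chunks

# Streaming usage: feed frames chunk by chunk, reusing the hidden state.
model = MultiStreamSpotter()
state = None
hand_chunk = torch.randn(1, 16, 512)   # 16 new frames of appearance features
loc_chunk = torch.randn(1, 16, 3)      # matching 3D hand locations
probs, state = model(hand_chunk, loc_chunk, state)
spotted = probs > 0.5                  # frames flagged as part of a gesture

Frame runs flagged by such a spotter would then be handed to a separate gesture classifier for labeling; the abstract pairs the spotter with an existing 3D CNN for that step.
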
Record ID: pubmed-7014506
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: Sensors (Basel)
Published Online: 2020-01-18
License: © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).