
Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks †

Bibliographic Details
Main Authors: Chen, Yung-Han, Huang, Chi-Hsuan, Syu, Sin-Wun, Kuo, Tien-Ying, Su, Po-Chyi
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8272142/
https://www.ncbi.nlm.nih.gov/pubmed/34206768
http://dx.doi.org/10.3390/s21134382
author Chen, Yung-Han
Huang, Chi-Hsuan
Syu, Sin-Wun
Kuo, Tien-Ying
Su, Po-Chyi
collection PubMed
description This research investigated real-time fingertip detection in frames captured from the increasingly popular wearable device, smart glasses. The egocentric-view fingertip detection and character recognition can be used to create a novel way of inputting texts. We first employed Unity3D to build a synthetic dataset with pointing gestures from the first-person perspective. The obvious benefits of using synthetic data are that they eliminate the need for time-consuming and error-prone manual labeling and they provide a large and high-quality dataset for a wide range of purposes. Following that, a modified Mask Regional Convolutional Neural Network (Mask R-CNN) is proposed, consisting of a region-based CNN for finger detection and a three-layer CNN for fingertip location. The process can be completed in 25 ms per frame for 6 [Formula: see text] RGB images, with an average error of [Formula: see text] pixels. The speed is high enough to enable real-time “air-writing”, where users are able to write characters in the air to input texts or commands while wearing smart glasses. The characters can be recognized by a ResNet-based CNN from the fingertip trajectories. Experimental results demonstrate the feasibility of this novel methodology.
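The abstract describes a pipeline in which fingertip positions detected in each frame are accumulated into a trajectory and then classified by a ResNet-based CNN. As a rough, purely illustrative sketch of the trajectory-preprocessing stage (not taken from the article; `normalize_trajectory` and `rasterize` are invented names, and the 28 x 28 grid size is an assumption), such a step might look like:

```python
# Hypothetical preprocessing for air-writing trajectories: scale the
# per-frame (x, y) fingertip detections into a fixed-size coordinate
# box and rasterize them into a binary grid for a character classifier.

def normalize_trajectory(points, size=28):
    """Scale (x, y) points into a size x size box, preserving aspect
    ratio and centering the stroke."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    span = max(max_x - min_x, max_y - min_y) or 1.0
    scale = (size - 1) / span
    # Offsets center the (possibly non-square) stroke in the box.
    off_x = (size - 1 - (max_x - min_x) * scale) / 2
    off_y = (size - 1 - (max_y - min_y) * scale) / 2
    return [((x - min_x) * scale + off_x, (y - min_y) * scale + off_y)
            for x, y in points]

def rasterize(points, size=28):
    """Draw normalized points into a binary size x size grid (row-major)."""
    grid = [[0] * size for _ in range(size)]
    for x, y in points:
        grid[int(round(y))][int(round(x))] = 1
    return grid
```

A real system would also resample the trajectory to a fixed number of points and smooth detection jitter before classification; this sketch only shows the coordinate normalization idea.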
format Online
Article
Text
id pubmed-8272142
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8272142 2021-07-11 Sensors (Basel), Article. MDPI 2021-06-26 /pmc/articles/PMC8272142/ /pubmed/34206768 http://dx.doi.org/10.3390/s21134382 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks †
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8272142/
https://www.ncbi.nlm.nih.gov/pubmed/34206768
http://dx.doi.org/10.3390/s21134382
work_keys_str_mv AT chenyunghan egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks
AT huangchihsuan egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks
AT syusinwun egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks
AT kuotienying egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks
AT supochyi egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks