Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine
Main Authors: | Rastgoo, Razieh; Kiani, Kourosh; Escalera, Sergio |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2018 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7512373/ https://www.ncbi.nlm.nih.gov/pubmed/33266533 http://dx.doi.org/10.3390/e20110809 |
_version_ | 1783586142902288384 |
---|---|
author | Rastgoo, Razieh Kiani, Kourosh Escalera, Sergio |
author_facet | Rastgoo, Razieh Kiani, Kourosh Escalera, Sergio |
author_sort | Rastgoo, Razieh |
collection | PubMed |
description | In this paper, a deep learning approach, the Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how the RBM, as a deep generative model, can capture the distribution of the input data for enhanced recognition of unseen data. Two modalities, RGB and Depth, are considered as model input in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are taken, and the hands in these cropped images are detected using a Convolutional Neural Network (CNN). Three types of detected hand image are then generated for each modality and fed to RBMs. The outputs of the RBMs for the two modalities are fused in another RBM in order to recognize the sign label of the input image. The proposed multi-modal model is trained on all, and on part, of the American alphabet and digits of four publicly available datasets. We also evaluate the robustness of the proposed model to noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusing methodology, achieves state-of-the-art results on the Massey University Gesture Dataset 2012, the American Sign Language (ASL) Fingerspelling Dataset from the University of Surrey’s Center for Vision, Speech and Signal Processing, the NYU dataset, and the ASL Fingerspelling A dataset. |
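The pipeline the abstract describes (a unimodal RBM per modality, with their hidden activations fused in a further RBM) can be sketched as follows. This is a minimal illustration only, assuming binary units trained with one step of contrastive divergence (CD-1) on random stand-in data; the paper's actual layer sizes, input preprocessing, and fusion hyperparameters are not given in this record.

```python
import numpy as np

class RBM:
    """Minimal binary RBM trained with CD-1 (illustrative sketch only)."""

    def __init__(self, n_visible, n_hidden, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible bias
        self.b_h = np.zeros(n_hidden)    # hidden bias
        self.lr = lr
        self.rng = rng

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def hidden_probs(self, v):
        """P(h=1 | v) for a batch of visible vectors."""
        return self._sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        """P(v=1 | h) for a batch of hidden vectors."""
        return self._sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0):
        """One contrastive-divergence (CD-1) update on batch v0."""
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)  # sample hidden
        v1 = self.visible_probs(h0)        # mean-field reconstruction
        ph1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (ph0 - ph1).mean(axis=0)
        return np.mean((v0 - v1) ** 2)     # reconstruction error

# Stand-in data: binarized RGB and Depth hand crops, flattened to vectors.
rng = np.random.default_rng(1)
rgb = (rng.random((64, 100)) < 0.5).astype(float)
depth = (rng.random((64, 100)) < 0.5).astype(float)

# One RBM per modality, as in the abstract's description.
rbm_rgb, rbm_depth = RBM(100, 32), RBM(100, 32)
for _ in range(20):
    rbm_rgb.cd1_step(rgb)
    rbm_depth.cd1_step(depth)

# Fusion: concatenate the two hidden representations and feed a third RBM.
fused_input = np.hstack([rbm_rgb.hidden_probs(rgb), rbm_depth.hidden_probs(depth)])
rbm_fuse = RBM(64, 16)
err = rbm_fuse.cd1_step(fused_input)
print(fused_input.shape, err)
```

In practice the fusion RBM's hidden layer would feed a classifier over the sign labels; that final supervised stage is omitted here, since this record does not specify how it is implemented.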
format | Online Article Text |
id | pubmed-7512373 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-75123732020-11-09 Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine Rastgoo, Razieh Kiani, Kourosh Escalera, Sergio Entropy (Basel) Article In this paper, a deep learning approach, the Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how the RBM, as a deep generative model, can capture the distribution of the input data for enhanced recognition of unseen data. Two modalities, RGB and Depth, are considered as model input in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are taken, and the hands in these cropped images are detected using a Convolutional Neural Network (CNN). Three types of detected hand image are then generated for each modality and fed to RBMs. The outputs of the RBMs for the two modalities are fused in another RBM in order to recognize the sign label of the input image. The proposed multi-modal model is trained on all, and on part, of the American alphabet and digits of four publicly available datasets. We also evaluate the robustness of the proposed model to noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusing methodology, achieves state-of-the-art results on the Massey University Gesture Dataset 2012, the American Sign Language (ASL) Fingerspelling Dataset from the University of Surrey’s Center for Vision, Speech and Signal Processing, the NYU dataset, and the ASL Fingerspelling A dataset. MDPI 2018-10-23 /pmc/articles/PMC7512373/ /pubmed/33266533 http://dx.doi.org/10.3390/e20110809 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Rastgoo, Razieh Kiani, Kourosh Escalera, Sergio Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine |
title | Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine |
title_full | Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine |
title_fullStr | Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine |
title_full_unstemmed | Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine |
title_short | Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine |
title_sort | multi-modal deep hand sign language recognition in still images using restricted boltzmann machine |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7512373/ https://www.ncbi.nlm.nih.gov/pubmed/33266533 http://dx.doi.org/10.3390/e20110809 |
work_keys_str_mv | AT rastgoorazieh multimodaldeephandsignlanguagerecognitioninstillimagesusingrestrictedboltzmannmachine AT kianikourosh multimodaldeephandsignlanguagerecognitioninstillimagesusingrestrictedboltzmannmachine AT escalerasergio multimodaldeephandsignlanguagerecognitioninstillimagesusingrestrictedboltzmannmachine |