One Model is Not Enough: Ensembles for Isolated Sign Language Recognition

In this paper, we dive into sign language recognition, focusing on the recognition of isolated signs. The task is defined as a classification problem, where a sequence of frames (i.e., images) is recognized as one of the given sign language glosses. We analyze two appearance-based approaches, I3D an...


Bibliographic Details
Main authors: Hrúz, Marek; Gruber, Ivan; Kanis, Jakub; Boháček, Matyáš; Hlaváč, Miroslav; Krňoul, Zdeněk
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9269724/
https://www.ncbi.nlm.nih.gov/pubmed/35808537
http://dx.doi.org/10.3390/s22135043
_version_ 1784744290782543872
author Hrúz, Marek
Gruber, Ivan
Kanis, Jakub
Boháček, Matyáš
Hlaváč, Miroslav
Krňoul, Zdeněk
author_sort Hrúz, Marek
collection PubMed
description In this paper, we dive into sign language recognition, focusing on the recognition of isolated signs. The task is defined as a classification problem, where a sequence of frames (i.e., images) is recognized as one of the given sign language glosses. We analyze two appearance-based approaches, I3D and TimeSformer, and one pose-based approach, SPOTER. The appearance-based approaches are trained on a few different data modalities, whereas the performance of SPOTER is evaluated on different types of preprocessing. All the methods are tested on two publicly available datasets: AUTSL and WLASL300. We experiment with ensemble techniques to achieve new state-of-the-art results of 73.84% accuracy on the WLASL300 dataset by using the CMA-ES optimization method to find the best ensemble weight parameters. Furthermore, we present an ensembling technique based on the Transformer model, which we call Neural Ensembler.
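The weighted ensembling mentioned in the description can be illustrated with a minimal sketch. It assumes each model outputs a class-probability vector per sample and that the ensemble prediction is the arg-max of a weighted average. The function names (weighted_ensemble, predict, accuracy, random_search) are hypothetical, and a plain random search stands in for CMA-ES purely to keep the example dependency-free. It is not the authors' implementation.

```python
import random


def weighted_ensemble(model_probs, weights):
    """Weighted average of per-model class probabilities for one sample."""
    total = sum(weights)
    num_classes = len(model_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, model_probs)) / total
        for c in range(num_classes)
    ]


def predict(model_probs, weights):
    """Index of the class with the highest combined probability."""
    combined = weighted_ensemble(model_probs, weights)
    return max(range(len(combined)), key=combined.__getitem__)


def accuracy(samples, labels, weights):
    """Fraction of samples whose ensemble prediction matches the label."""
    hits = sum(predict(s, weights) == y for s, y in zip(samples, labels))
    return hits / len(labels)


def random_search(samples, labels, num_models, trials=200, seed=0):
    """Stand-in optimizer (NOT CMA-ES): keep the best random weight vector.

    Starts from equal weights, so the returned accuracy is never worse
    than the uniform ensemble on the data it is tuned on.
    """
    rng = random.Random(seed)
    best_w = [1.0] * num_models
    best_acc = accuracy(samples, labels, best_w)
    for _ in range(trials):
        # Small offset avoids an all-zero weight vector.
        cand = [rng.random() + 1e-6 for _ in range(num_models)]
        acc = accuracy(samples, labels, cand)
        if acc > best_acc:
            best_w, best_acc = cand, acc
    return best_w, best_acc


# Toy validation set: two models, two classes (made-up numbers).
samples = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.45, 0.55], [0.2, 0.8]],
]
labels = [0, 1]
weights, acc = random_search(samples, labels, num_models=2)
```

In practice the weights would be tuned on a held-out validation split rather than on the test data, and a CMA-ES implementation would replace random_search while keeping the same accuracy objective.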
format Online
Article
Text
id pubmed-9269724
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9269724 2022-07-09 One Model is Not Enough: Ensembles for Isolated Sign Language Recognition Hrúz, Marek; Gruber, Ivan; Kanis, Jakub; Boháček, Matyáš; Hlaváč, Miroslav; Krňoul, Zdeněk Sensors (Basel) Article MDPI 2022-07-04 /pmc/articles/PMC9269724/ /pubmed/35808537 http://dx.doi.org/10.3390/s22135043 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title One Model is Not Enough: Ensembles for Isolated Sign Language Recognition
topic Article