One Model is Not Enough: Ensembles for Isolated Sign Language Recognition
In this paper, we dive into sign language recognition, focusing on the recognition of isolated signs. The task is defined as a classification problem, where a sequence of frames (i.e., images) is recognized as one of the given sign language glosses. We analyze two appearance-based approaches, I3D an...
Main authors: | Hrúz, Marek; Gruber, Ivan; Kanis, Jakub; Boháček, Matyáš; Hlaváč, Miroslav; Krňoul, Zdeněk |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9269724/ https://www.ncbi.nlm.nih.gov/pubmed/35808537 http://dx.doi.org/10.3390/s22135043 |
_version_ | 1784744290782543872 |
---|---|
author | Hrúz, Marek Gruber, Ivan Kanis, Jakub Boháček, Matyáš Hlaváč, Miroslav Krňoul, Zdeněk |
author_facet | Hrúz, Marek Gruber, Ivan Kanis, Jakub Boháček, Matyáš Hlaváč, Miroslav Krňoul, Zdeněk |
author_sort | Hrúz, Marek |
collection | PubMed |
description | In this paper, we dive into sign language recognition, focusing on the recognition of isolated signs. The task is defined as a classification problem, where a sequence of frames (i.e., images) is recognized as one of the given sign language glosses. We analyze two appearance-based approaches, I3D and TimeSformer, and one pose-based approach, SPOTER. The appearance-based approaches are trained on a few different data modalities, whereas the performance of SPOTER is evaluated on different types of preprocessing. All the methods are tested on two publicly available datasets: AUTSL and WLASL300. We experiment with ensemble techniques to achieve new state-of-the-art results of 73.84% accuracy on the WLASL300 dataset by using the CMA-ES optimization method to find the best ensemble weight parameters. Furthermore, we present an ensembling technique based on the Transformer model, which we call Neural Ensembler. |
format | Online Article Text |
id | pubmed-9269724 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9269724 2022-07-09 One Model is Not Enough: Ensembles for Isolated Sign Language Recognition Hrúz, Marek Gruber, Ivan Kanis, Jakub Boháček, Matyáš Hlaváč, Miroslav Krňoul, Zdeněk Sensors (Basel) Article In this paper, we dive into sign language recognition, focusing on the recognition of isolated signs. The task is defined as a classification problem, where a sequence of frames (i.e., images) is recognized as one of the given sign language glosses. We analyze two appearance-based approaches, I3D and TimeSformer, and one pose-based approach, SPOTER. The appearance-based approaches are trained on a few different data modalities, whereas the performance of SPOTER is evaluated on different types of preprocessing. All the methods are tested on two publicly available datasets: AUTSL and WLASL300. We experiment with ensemble techniques to achieve new state-of-the-art results of 73.84% accuracy on the WLASL300 dataset by using the CMA-ES optimization method to find the best ensemble weight parameters. Furthermore, we present an ensembling technique based on the Transformer model, which we call Neural Ensembler. MDPI 2022-07-04 /pmc/articles/PMC9269724/ /pubmed/35808537 http://dx.doi.org/10.3390/s22135043 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Hrúz, Marek Gruber, Ivan Kanis, Jakub Boháček, Matyáš Hlaváč, Miroslav Krňoul, Zdeněk One Model is Not Enough: Ensembles for Isolated Sign Language Recognition |
title | One Model is Not Enough: Ensembles for Isolated Sign Language Recognition |
title_full | One Model is Not Enough: Ensembles for Isolated Sign Language Recognition |
title_fullStr | One Model is Not Enough: Ensembles for Isolated Sign Language Recognition |
title_full_unstemmed | One Model is Not Enough: Ensembles for Isolated Sign Language Recognition |
title_short | One Model is Not Enough: Ensembles for Isolated Sign Language Recognition |
title_sort | one model is not enough: ensembles for isolated sign language recognition |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9269724/ https://www.ncbi.nlm.nih.gov/pubmed/35808537 http://dx.doi.org/10.3390/s22135043 |
work_keys_str_mv | AT hruzmarek onemodelisnotenoughensemblesforisolatedsignlanguagerecognition AT gruberivan onemodelisnotenoughensemblesforisolatedsignlanguagerecognition AT kanisjakub onemodelisnotenoughensemblesforisolatedsignlanguagerecognition AT bohacekmatyas onemodelisnotenoughensemblesforisolatedsignlanguagerecognition AT hlavacmiroslav onemodelisnotenoughensemblesforisolatedsignlanguagerecognition AT krnoulzdenek onemodelisnotenoughensemblesforisolatedsignlanguagerecognition |