Contrastive self-supervised representation learning without negative samples for multimodal human action recognition
Action recognition is an important component of human-computer interaction, and multimodal feature representation and learning methods can be used to improve recognition performance due to the interrelation and complementarity between different modalities. However, due to the lack of large-scale lab...
Main Authors: | Yang, Huaigang; Ren, Ziliang; Yuan, Huaqiang; Xu, Zhenyu; Zhou, Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2023 |
Subjects: | Neuroscience |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354269/ https://www.ncbi.nlm.nih.gov/pubmed/37476841 http://dx.doi.org/10.3389/fnins.2023.1225312 |
_version_ | 1785074891764006912 |
---|---|
author | Yang, Huaigang Ren, Ziliang Yuan, Huaqiang Xu, Zhenyu Zhou, Jun |
author_facet | Yang, Huaigang Ren, Ziliang Yuan, Huaqiang Xu, Zhenyu Zhou, Jun |
author_sort | Yang, Huaigang |
collection | PubMed |
description | Action recognition is an important component of human-computer interaction, and multimodal feature representation and learning methods can be used to improve recognition performance due to the interrelation and complementarity between different modalities. However, due to the lack of large-scale labeled samples, the performance of existing ConvNets-based methods is severely constrained. In this paper, a novel and effective multimodal feature representation and contrastive self-supervised learning framework is proposed to improve the action recognition performance of models and the generalization ability of application scenarios. The proposed recognition framework employs weight sharing between two branches and does not require negative samples, so it can effectively learn useful feature representations from multimodal unlabeled data, e.g., skeleton sequences and inertial measurement unit (IMU) signals. Extensive experiments are conducted on two benchmarks, UTD-MHAD and MMAct, and the results show that our proposed recognition framework outperforms both unimodal and multimodal baselines in action retrieval, semi-supervised learning, and zero-shot learning scenarios. |
format | Online Article Text |
id | pubmed-10354269 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10354269 2023-07-20 Contrastive self-supervised representation learning without negative samples for multimodal human action recognition Yang, Huaigang Ren, Ziliang Yuan, Huaqiang Xu, Zhenyu Zhou, Jun Front Neurosci Neuroscience Action recognition is an important component of human-computer interaction, and multimodal feature representation and learning methods can be used to improve recognition performance due to the interrelation and complementarity between different modalities. However, due to the lack of large-scale labeled samples, the performance of existing ConvNets-based methods is severely constrained. In this paper, a novel and effective multimodal feature representation and contrastive self-supervised learning framework is proposed to improve the action recognition performance of models and the generalization ability of application scenarios. The proposed recognition framework employs weight sharing between two branches and does not require negative samples, so it can effectively learn useful feature representations from multimodal unlabeled data, e.g., skeleton sequences and inertial measurement unit (IMU) signals. Extensive experiments are conducted on two benchmarks, UTD-MHAD and MMAct, and the results show that our proposed recognition framework outperforms both unimodal and multimodal baselines in action retrieval, semi-supervised learning, and zero-shot learning scenarios. Frontiers Media S.A. 2023-07-05 /pmc/articles/PMC10354269/ /pubmed/37476841 http://dx.doi.org/10.3389/fnins.2023.1225312 Text en Copyright © 2023 Yang, Ren, Yuan, Xu and Zhou. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Yang, Huaigang Ren, Ziliang Yuan, Huaqiang Xu, Zhenyu Zhou, Jun Contrastive self-supervised representation learning without negative samples for multimodal human action recognition |
title | Contrastive self-supervised representation learning without negative samples for multimodal human action recognition |
title_full | Contrastive self-supervised representation learning without negative samples for multimodal human action recognition |
title_fullStr | Contrastive self-supervised representation learning without negative samples for multimodal human action recognition |
title_full_unstemmed | Contrastive self-supervised representation learning without negative samples for multimodal human action recognition |
title_short | Contrastive self-supervised representation learning without negative samples for multimodal human action recognition |
title_sort | contrastive self-supervised representation learning without negative samples for multimodal human action recognition |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354269/ https://www.ncbi.nlm.nih.gov/pubmed/37476841 http://dx.doi.org/10.3389/fnins.2023.1225312 |
work_keys_str_mv | AT yanghuaigang contrastiveselfsupervisedrepresentationlearningwithoutnegativesamplesformultimodalhumanactionrecognition AT renziliang contrastiveselfsupervisedrepresentationlearningwithoutnegativesamplesformultimodalhumanactionrecognition AT yuanhuaqiang contrastiveselfsupervisedrepresentationlearningwithoutnegativesamplesformultimodalhumanactionrecognition AT xuzhenyu contrastiveselfsupervisedrepresentationlearningwithoutnegativesamplesformultimodalhumanactionrecognition AT zhoujun contrastiveselfsupervisedrepresentationlearningwithoutnegativesamplesformultimodalhumanactionrecognition |
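The abstract above describes a two-branch, weight-sharing contrastive framework that learns from unlabeled skeleton and IMU data without negative samples. The following is a minimal, hypothetical sketch of that general idea using a SimSiam-style stop-gradient objective; the module names, encoder design, dimensions, and loss details are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch of a negative-sample-free, cross-modal contrastive objective
# (stop-gradient style) with a shared projector/predictor. All names and sizes
# here are hypothetical assumptions, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityEncoder(nn.Module):
    """Toy encoder: flattens a (batch, time, channels) sequence into an embedding."""
    def __init__(self, in_dim, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(), nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim)
        )

    def forward(self, x):
        return self.net(x)


class NegativeFreeContrastive(nn.Module):
    """Skeleton and IMU branches share the projector and predictor weights; no negative pairs are used."""
    def __init__(self, skel_dim, imu_dim, emb_dim=128, proj_dim=64):
        super().__init__()
        self.skel_encoder = ModalityEncoder(skel_dim, emb_dim)
        self.imu_encoder = ModalityEncoder(imu_dim, emb_dim)
        # Shared projection head and predictor (weight sharing between the two branches).
        self.projector = nn.Sequential(nn.Linear(emb_dim, proj_dim), nn.ReLU(),
                                       nn.Linear(proj_dim, proj_dim))
        self.predictor = nn.Sequential(nn.Linear(proj_dim, proj_dim), nn.ReLU(),
                                       nn.Linear(proj_dim, proj_dim))

    @staticmethod
    def _loss(p, z):
        # Negative cosine similarity with a stop-gradient on the target branch.
        return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

    def forward(self, skeleton, imu):
        z_s = self.projector(self.skel_encoder(skeleton))
        z_i = self.projector(self.imu_encoder(imu))
        p_s, p_i = self.predictor(z_s), self.predictor(z_i)
        # Symmetrized loss: each modality predicts the other's (detached) projection.
        return 0.5 * (self._loss(p_s, z_i) + self._loss(p_i, z_s))


if __name__ == "__main__":
    # Dummy unlabeled batch: 8 clips, 30 frames, 20 joints x 3 coords; 6 IMU channels.
    skeleton = torch.randn(8, 30, 60)
    imu = torch.randn(8, 30, 6)
    model = NegativeFreeContrastive(skel_dim=30 * 60, imu_dim=30 * 6)
    loss = model(skeleton, imu)
    loss.backward()
    print(f"self-supervised loss: {loss.item():.4f}")
```

The pretrained encoders from such a setup could then be evaluated in the downstream scenarios the abstract mentions (action retrieval, semi-supervised learning, zero-shot learning), e.g., by freezing them and training a lightweight classifier on top.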