
Action recognition based on multimode fusion for VR online platform

The current popular online communication platforms can convey information only in the form of text, voice, pictures, and other electronic means. The richness and reliability of this information are not comparable to those of traditional face-to-face communication. The use of virtual reality (VR) technology for online communication is a viable alternative to face-to-face communication. On current VR online communication platforms, users inhabit the virtual world as avatars, which enables “face-to-face” communication to a certain extent. However, the avatar's actions do not follow the user's, which makes the communication process less realistic. Decision-makers need to make decisions based on the behavior of VR users, but there are no effective methods for collecting action data in VR environments. In our work, three modalities of nine actions from VR users are collected using virtual reality head-mounted display (VR HMD) built-in sensors, RGB cameras, and human pose estimation. Using these data and advanced multimodal fusion action recognition networks, we obtained a high-accuracy action recognition model. In addition, we take advantage of the VR HMD to collect 3D position data and design a 2D key point augmentation scheme for VR users. Using the augmented 2D key point data and VR HMD sensor data, we can train action recognition models with high accuracy and strong stability. In data collection and experimental work, we focus our research on classroom scenes, and the results can be extended to other scenes.
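
The abstract names two techniques, a 2D key point augmentation scheme driven by the HMD's 3D position and a multimodal fusion action recognition network, without specifying either. The following minimal Python sketch shows one plausible shape for such a pipeline; it is not the authors' implementation. The pinhole projection step, the two-stream GRU encoders, the feature dimensions, and all function and class names are assumptions.

    import torch
    import torch.nn as nn

    def project_hmd_to_keypoint(p_world, K, T_cam_world):
        # Project the HMD's 3D position (world frame) to a 2D image point that
        # can serve as an extra "head" key point. Assumes a calibrated camera:
        #   p_world     : (3,) HMD position from the headset's tracking system
        #   K           : (3, 3) RGB-camera intrinsic matrix
        #   T_cam_world : (4, 4) world-to-camera extrinsic transform
        p_h = torch.cat([p_world, torch.ones(1)])   # homogeneous coordinates
        p_cam = (T_cam_world @ p_h)[:3]             # into the camera frame
        uv = K @ (p_cam / p_cam[2])                 # pinhole projection
        return uv[:2]                               # (u, v) pixel coordinates

    class LateFusionClassifier(nn.Module):
        # Two-stream late fusion: augmented 2D key point sequences plus HMD
        # built-in sensor (e.g., IMU) sequences, fused by concatenation.
        def __init__(self, kp_dim=17 * 2, imu_dim=6, hidden=64, n_actions=9):
            super().__init__()
            self.kp_enc = nn.GRU(kp_dim, hidden, batch_first=True)
            self.imu_enc = nn.GRU(imu_dim, hidden, batch_first=True)
            self.head = nn.Linear(2 * hidden, n_actions)

        def forward(self, kp_seq, imu_seq):
            _, h_kp = self.kp_enc(kp_seq)           # final hidden state per stream
            _, h_imu = self.imu_enc(imu_seq)
            fused = torch.cat([h_kp[-1], h_imu[-1]], dim=-1)
            return self.head(fused)                 # logits over the nine actions

    # Toy usage: a batch of 4 clips, 30 frames each.
    model = LateFusionClassifier()
    logits = model(torch.randn(4, 30, 34), torch.randn(4, 30, 6))
    print(logits.shape)  # torch.Size([4, 9])

Late fusion by concatenating per-modality encodings is only one plausible design; the paper's network could equally use score-level or attention-based fusion.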


Bibliographic Details
Main Authors: Li, Xuan, Chen, Hengxin, He, Shengdong, Chen, Xinrun, Dong, Shuang, Yan, Ping, Fang, Bin
Format: Online Article Text
Language: English
Published: Springer London 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955528/
https://www.ncbi.nlm.nih.gov/pubmed/37360810
http://dx.doi.org/10.1007/s10055-023-00773-4
_version_ 1784894368930332672
author Li, Xuan
Chen, Hengxin
He, Shengdong
Chen, Xinrun
Dong, Shuang
Yan, Ping
Fang, Bin
author_facet Li, Xuan
Chen, Hengxin
He, Shengdong
Chen, Xinrun
Dong, Shuang
Yan, Ping
Fang, Bin
author_sort Li, Xuan
collection PubMed
description The current popular online communication platforms can convey information only in the form of text, voice, pictures, and other electronic means. The richness and reliability of this information are not comparable to those of traditional face-to-face communication. The use of virtual reality (VR) technology for online communication is a viable alternative to face-to-face communication. On current VR online communication platforms, users inhabit the virtual world as avatars, which enables “face-to-face” communication to a certain extent. However, the avatar's actions do not follow the user's, which makes the communication process less realistic. Decision-makers need to make decisions based on the behavior of VR users, but there are no effective methods for collecting action data in VR environments. In our work, three modalities of nine actions from VR users are collected using virtual reality head-mounted display (VR HMD) built-in sensors, RGB cameras, and human pose estimation. Using these data and advanced multimodal fusion action recognition networks, we obtained a high-accuracy action recognition model. In addition, we take advantage of the VR HMD to collect 3D position data and design a 2D key point augmentation scheme for VR users. Using the augmented 2D key point data and VR HMD sensor data, we can train action recognition models with high accuracy and strong stability. In data collection and experimental work, we focus our research on classroom scenes, and the results can be extended to other scenes.
format Online
Article
Text
id pubmed-9955528
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer London
record_format MEDLINE/PubMed
spelling pubmed-9955528 2023-02-28 Action recognition based on multimode fusion for VR online platform Li, Xuan Chen, Hengxin He, Shengdong Chen, Xinrun Dong, Shuang Yan, Ping Fang, Bin Virtual Real Original Article The current popular online communication platforms can convey information only in the form of text, voice, pictures, and other electronic means. The richness and reliability of this information are not comparable to those of traditional face-to-face communication. The use of virtual reality (VR) technology for online communication is a viable alternative to face-to-face communication. On current VR online communication platforms, users inhabit the virtual world as avatars, which enables “face-to-face” communication to a certain extent. However, the avatar's actions do not follow the user's, which makes the communication process less realistic. Decision-makers need to make decisions based on the behavior of VR users, but there are no effective methods for collecting action data in VR environments. In our work, three modalities of nine actions from VR users are collected using virtual reality head-mounted display (VR HMD) built-in sensors, RGB cameras, and human pose estimation. Using these data and advanced multimodal fusion action recognition networks, we obtained a high-accuracy action recognition model. In addition, we take advantage of the VR HMD to collect 3D position data and design a 2D key point augmentation scheme for VR users. Using the augmented 2D key point data and VR HMD sensor data, we can train action recognition models with high accuracy and strong stability. In data collection and experimental work, we focus our research on classroom scenes, and the results can be extended to other scenes. Springer London 2023-02-24 /pmc/articles/PMC9955528/ /pubmed/37360810 http://dx.doi.org/10.1007/s10055-023-00773-4 Text en © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Original Article
Li, Xuan
Chen, Hengxin
He, Shengdong
Chen, Xinrun
Dong, Shuang
Yan, Ping
Fang, Bin
Action recognition based on multimode fusion for VR online platform
title Action recognition based on multimode fusion for VR online platform
title_full Action recognition based on multimode fusion for VR online platform
title_fullStr Action recognition based on multimode fusion for VR online platform
title_full_unstemmed Action recognition based on multimode fusion for VR online platform
title_short Action recognition based on multimode fusion for VR online platform
title_sort action recognition based on multimode fusion for vr online platform
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955528/
https://www.ncbi.nlm.nih.gov/pubmed/37360810
http://dx.doi.org/10.1007/s10055-023-00773-4
work_keys_str_mv AT lixuan actionrecognitionbasedonmultimodefusionforvronlineplatform
AT chenhengxin actionrecognitionbasedonmultimodefusionforvronlineplatform
AT heshengdong actionrecognitionbasedonmultimodefusionforvronlineplatform
AT chenxinrun actionrecognitionbasedonmultimodefusionforvronlineplatform
AT dongshuang actionrecognitionbasedonmultimodefusionforvronlineplatform
AT yanping actionrecognitionbasedonmultimodefusionforvronlineplatform
AT fangbin actionrecognitionbasedonmultimodefusionforvronlineplatform