Action recognition based on multimode fusion for VR online platform
The current popular online communication platforms can convey information only in the form of text, voice, pictures, and other electronic means. The richness and reliability of information is not comparable to traditional face-to-face communication. The use of virtual reality (VR) technology for online communication is a viable alternative to face-to-face communication. In the current VR online communication platform, users are in a virtual world in the form of avatars, which can achieve “face-to-face” communication to a certain extent. However, the actions of the avatar do not follow the user, which makes the communication process less realistic. Decision-makers need to make decisions based on the behavior of VR users, but there are no effective methods for action data collection in VR environments. In our work, three modalities of nine actions from VR users are collected using a virtual reality head-mounted display (VR HMD) built-in sensors, RGB cameras and human pose estimation. Using these data and advanced multimodal fusion action recognition networks, we obtained a high accuracy action recognition model. In addition, we take advantage of the VR HMD to collect 3D position data and design a 2D key point augmentation scheme for VR users. Using the augmented 2D key point data and VR HMD sensor data, we can train action recognition models with high accuracy and strong stability. In data collection and experimental work, we focus our research on classroom scenes, and the results can be extended to other scenes.
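The abstract describes fusing several modalities (HMD sensor data, RGB video, estimated 2D key points) into one action prediction. As a minimal illustration of score-level ("late") fusion — a common approach, not necessarily the exact network used in the article; all names and numbers here are hypothetical — per-modality classifier logits can be converted to probabilities and averaged before taking the argmax over the nine action classes:

```python
import math

def softmax(logits):
    """Convert raw classifier scores to probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def late_fusion(modality_logits, weights=None):
    """Score-level fusion: weighted average of per-modality class probabilities.

    modality_logits: one logit vector per modality, all of equal length
    weights: optional per-modality weights (default: uniform)
    """
    n = len(modality_logits)
    if weights is None:
        weights = [1.0 / n] * n
    probs = [softmax(l) for l in modality_logits]
    num_classes = len(probs[0])
    return [sum(w * p[c] for w, p in zip(weights, probs))
            for c in range(num_classes)]

# Hypothetical logits from three modalities (HMD sensors, RGB,
# 2D key points) over nine action classes; fuse and pick the argmax.
hmd = [0.1, 2.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
rgb = [0.2, 1.5, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
kpt = [0.0, 1.8, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
fused = late_fusion([hmd, rgb, kpt])
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Because each softmax output sums to 1 and the uniform weights sum to 1, the fused vector is itself a valid probability distribution; class 1 wins here since every modality ranks it highest.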
Main Authors: Li, Xuan; Chen, Hengxin; He, Shengdong; Chen, Xinrun; Dong, Shuang; Yan, Ping; Fang, Bin
Format: Online Article Text
Language: English
Published: Springer London, 2023
Subjects: Original Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955528/ ; https://www.ncbi.nlm.nih.gov/pubmed/37360810 ; http://dx.doi.org/10.1007/s10055-023-00773-4
_version_ | 1784894368930332672 |
author | Li, Xuan Chen, Hengxin He, Shengdong Chen, Xinrun Dong, Shuang Yan, Ping Fang, Bin |
author_sort | Li, Xuan |
collection | PubMed |
description | The current popular online communication platforms can convey information only in the form of text, voice, pictures, and other electronic means. The richness and reliability of information is not comparable to traditional face-to-face communication. The use of virtual reality (VR) technology for online communication is a viable alternative to face-to-face communication. In the current VR online communication platform, users are in a virtual world in the form of avatars, which can achieve “face-to-face” communication to a certain extent. However, the actions of the avatar do not follow the user, which makes the communication process less realistic. Decision-makers need to make decisions based on the behavior of VR users, but there are no effective methods for action data collection in VR environments. In our work, three modalities of nine actions from VR users are collected using a virtual reality head-mounted display (VR HMD) built-in sensors, RGB cameras and human pose estimation. Using these data and advanced multimodal fusion action recognition networks, we obtained a high accuracy action recognition model. In addition, we take advantage of the VR HMD to collect 3D position data and design a 2D key point augmentation scheme for VR users. Using the augmented 2D key point data and VR HMD sensor data, we can train action recognition models with high accuracy and strong stability. In data collection and experimental work, we focus our research on classroom scenes, and the results can be extended to other scenes. |
format | Online Article Text |
id | pubmed-9955528 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer London |
record_format | MEDLINE/PubMed |
spelling | pubmed-9955528 2023-02-28 Action recognition based on multimode fusion for VR online platform Li, Xuan; Chen, Hengxin; He, Shengdong; Chen, Xinrun; Dong, Shuang; Yan, Ping; Fang, Bin. Virtual Real. Original Article. (Abstract identical to the description field above.) |
Springer London 2023-02-24 /pmc/articles/PMC9955528/ /pubmed/37360810 http://dx.doi.org/10.1007/s10055-023-00773-4 Text en © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
title | Action recognition based on multimode fusion for VR online platform |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955528/ https://www.ncbi.nlm.nih.gov/pubmed/37360810 http://dx.doi.org/10.1007/s10055-023-00773-4 |