Energy-Guided Temporal Segmentation Network for Multimodal Human Action Recognition
To achieve the satisfactory performance of human action recognition, a central task is to address the sub-action sharing problem, especially in similar action classes. Nevertheless, most existing convolutional neural network (CNN)-based action recognition algorithms uniformly divide video into frame...
Main authors: | Liu, Qiang; Chen, Enqing; Gao, Lei; Liang, Chengwu; Liu, Hao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506802/ https://www.ncbi.nlm.nih.gov/pubmed/32825038 http://dx.doi.org/10.3390/s20174673 |
_version_ | 1783585097526542336 |
---|---|
author | Liu, Qiang Chen, Enqing Gao, Lei Liang, Chengwu Liu, Hao |
author_facet | Liu, Qiang Chen, Enqing Gao, Lei Liang, Chengwu Liu, Hao |
author_sort | Liu, Qiang |
collection | PubMed |
description | To achieve the satisfactory performance of human action recognition, a central task is to address the sub-action sharing problem, especially in similar action classes. Nevertheless, most existing convolutional neural network (CNN)-based action recognition algorithms uniformly divide video into frames and then randomly select the frames as inputs, ignoring the distinct characteristics among different frames. In recent years, depth videos have been increasingly used for action recognition, but most methods merely focus on the spatial information of the different actions without utilizing temporal information. In order to address these issues, a novel energy-guided temporal segmentation method is proposed here, and a multimodal fusion strategy is employed with the proposed segmentation method to construct an energy-guided temporal segmentation network (EGTSN). Specifically, the EGTSN had two parts: energy-guided video segmentation and a multimodal fusion heterogeneous CNN. The proposed solution was evaluated on a public large-scale NTU RGB+D dataset. Comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed network. |
format | Online Article Text |
id | pubmed-7506802 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-75068022020-09-26 Energy-Guided Temporal Segmentation Network for Multimodal Human Action Recognition Liu, Qiang Chen, Enqing Gao, Lei Liang, Chengwu Liu, Hao Sensors (Basel) Article To achieve the satisfactory performance of human action recognition, a central task is to address the sub-action sharing problem, especially in similar action classes. Nevertheless, most existing convolutional neural network (CNN)-based action recognition algorithms uniformly divide video into frames and then randomly select the frames as inputs, ignoring the distinct characteristics among different frames. In recent years, depth videos have been increasingly used for action recognition, but most methods merely focus on the spatial information of the different actions without utilizing temporal information. In order to address these issues, a novel energy-guided temporal segmentation method is proposed here, and a multimodal fusion strategy is employed with the proposed segmentation method to construct an energy-guided temporal segmentation network (EGTSN). Specifically, the EGTSN had two parts: energy-guided video segmentation and a multimodal fusion heterogeneous CNN. The proposed solution was evaluated on a public large-scale NTU RGB+D dataset. Comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed network. MDPI 2020-08-19 /pmc/articles/PMC7506802/ /pubmed/32825038 http://dx.doi.org/10.3390/s20174673 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Liu, Qiang Chen, Enqing Gao, Lei Liang, Chengwu Liu, Hao Energy-Guided Temporal Segmentation Network for Multimodal Human Action Recognition |
title | Energy-Guided Temporal Segmentation Network for Multimodal Human Action Recognition |
title_sort | energy-guided temporal segmentation network for multimodal human action recognition |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506802/ https://www.ncbi.nlm.nih.gov/pubmed/32825038 http://dx.doi.org/10.3390/s20174673 |