Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition
Human action recognition plays a significant part in the research community due to its emerging applications. A variety of approaches have been proposed to resolve this problem; however, several issues still need to be addressed. In action recognition, effectively extracting and aggregating spatiotemporal information plays a vital role in describing a video. In this research, we propose a novel approach to recognizing human actions that considers both deep spatial features and handcrafted spatiotemporal features. Firstly, we extract deep spatial features by employing a state-of-the-art deep convolutional network, namely Inception-ResNet-v2. Secondly, we introduce a novel handcrafted feature descriptor, namely the Weber's law based Volume Local Gradient Ternary Pattern (WVLGTP), which brings out the spatiotemporal features. It also captures shape information through a gradient operation. Furthermore, a Weber's law based threshold value and a ternary pattern based on an adaptive local threshold are presented to effectively handle noisy center pixel values. Besides, a multi-resolution approach for WVLGTP based on an averaging scheme is also presented. Afterward, both extracted feature sets are concatenated and fed to a Support Vector Machine to perform the classification. Lastly, extensive experimental analysis shows that our proposed method outperforms state-of-the-art approaches in terms of accuracy.
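To make the pipeline in the abstract concrete, the following Python sketch illustrates the general idea under simplifying assumptions: a Weber's law threshold drives a local ternary pattern computed per 2D frame (standing in for the paper's 3D, gradient-based WVLGTP, with the multi-resolution averaging omitted), and the resulting histogram is concatenated with deep spatial features before an SVM. All names and constants below (`local_ternary_pattern`, `k=0.1`, the linear kernel) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the abstract's fusion pipeline. Assumptions: a 2D
# per-frame local ternary pattern with a Weber's-law threshold stands in
# for the paper's 3D, gradient-based WVLGTP; multi-resolution averaging
# is omitted. Names and constants are illustrative, not the authors'.
import numpy as np
from sklearn.svm import SVC

def local_ternary_pattern(frame, k=0.1):
    """Encode each interior pixel's 8-neighborhood as a ternary code.
    The threshold t = k * center follows Weber's law: the perceptible
    intensity change grows with the center intensity, damping the effect
    of a noisy center pixel."""
    frame = frame.astype(np.float32)
    c = frame[1:-1, 1:-1]                        # center pixels
    t = k * c                                    # Weber's-law threshold
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]  # 8 neighbors, clockwise
    upper = np.zeros(c.shape, dtype=np.int32)    # bits for the +1 states
    lower = np.zeros(c.shape, dtype=np.int32)    # bits for the -1 states
    h, w = frame.shape
    for bit, (dy, dx) in enumerate(offsets):
        n = frame[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]  # shifted neighbors
        upper |= (n >= c + t).astype(np.int32) << bit
        lower |= (n <= c - t).astype(np.int32) << bit
    # As in standard local ternary patterns, histogram the two binary
    # maps separately and concatenate them into one 512-bin descriptor.
    hist = np.concatenate([np.bincount(upper.ravel(), minlength=256),
                           np.bincount(lower.ravel(), minlength=256)])
    hist = hist.astype(np.float32)
    return hist / (hist.sum() + 1e-8)            # L1-normalize

def fuse_and_classify(deep_feats, videos, labels):
    """deep_feats: (N, D) pooled CNN outputs (e.g., Inception-ResNet-v2).
    videos: list of N videos, each a list of 2D grayscale frames."""
    handcrafted = np.stack([
        np.mean([local_ternary_pattern(f) for f in frames], axis=0)
        for frames in videos])                   # average codes over time
    fused = np.hstack([deep_feats, handcrafted])   # simple concatenation
    return SVC(kernel="linear").fit(fused, labels)  # SVM on fused features
```

In the paper the ternary responses are computed over spatiotemporal volumes of gradient values rather than raw per-frame intensities, but the concatenate-then-SVM step matches the abstract's description directly.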
Main Authors: | Uddin, Md Azher; Lee, Young-Koo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2019 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6479698/ https://www.ncbi.nlm.nih.gov/pubmed/30987018 http://dx.doi.org/10.3390/s19071599 |
_version_ | 1783413405169745920 |
---|---|
author | Uddin, Md Azher Lee, Young-Koo |
author_facet | Uddin, Md Azher Lee, Young-Koo |
author_sort | Uddin, Md Azher |
collection | PubMed |
description | Human action recognition plays a significant part in the research community due to its emerging applications. A variety of approaches have been proposed to resolve this problem; however, several issues still need to be addressed. In action recognition, effectively extracting and aggregating spatiotemporal information plays a vital role in describing a video. In this research, we propose a novel approach to recognizing human actions that considers both deep spatial features and handcrafted spatiotemporal features. Firstly, we extract deep spatial features by employing a state-of-the-art deep convolutional network, namely Inception-ResNet-v2. Secondly, we introduce a novel handcrafted feature descriptor, namely the Weber's law based Volume Local Gradient Ternary Pattern (WVLGTP), which brings out the spatiotemporal features. It also captures shape information through a gradient operation. Furthermore, a Weber's law based threshold value and a ternary pattern based on an adaptive local threshold are presented to effectively handle noisy center pixel values. Besides, a multi-resolution approach for WVLGTP based on an averaging scheme is also presented. Afterward, both extracted feature sets are concatenated and fed to a Support Vector Machine to perform the classification. Lastly, extensive experimental analysis shows that our proposed method outperforms state-of-the-art approaches in terms of accuracy. |
format | Online Article Text |
id | pubmed-6479698 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6479698 2019-04-29 Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition Uddin, Md Azher Lee, Young-Koo Sensors (Basel) Article Human action recognition plays a significant part in the research community due to its emerging applications. A variety of approaches have been proposed to resolve this problem; however, several issues still need to be addressed. In action recognition, effectively extracting and aggregating spatiotemporal information plays a vital role in describing a video. In this research, we propose a novel approach to recognizing human actions that considers both deep spatial features and handcrafted spatiotemporal features. Firstly, we extract deep spatial features by employing a state-of-the-art deep convolutional network, namely Inception-ResNet-v2. Secondly, we introduce a novel handcrafted feature descriptor, namely the Weber's law based Volume Local Gradient Ternary Pattern (WVLGTP), which brings out the spatiotemporal features. It also captures shape information through a gradient operation. Furthermore, a Weber's law based threshold value and a ternary pattern based on an adaptive local threshold are presented to effectively handle noisy center pixel values. Besides, a multi-resolution approach for WVLGTP based on an averaging scheme is also presented. Afterward, both extracted feature sets are concatenated and fed to a Support Vector Machine to perform the classification. Lastly, extensive experimental analysis shows that our proposed method outperforms state-of-the-art approaches in terms of accuracy. MDPI 2019-04-02 /pmc/articles/PMC6479698/ /pubmed/30987018 http://dx.doi.org/10.3390/s19071599 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Uddin, Md Azher Lee, Young-Koo Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition |
title | Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition |
title_full | Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition |
title_fullStr | Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition |
title_full_unstemmed | Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition |
title_short | Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition |
title_sort | feature fusion of deep spatial features and handcrafted spatiotemporal features for human action recognition |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6479698/ https://www.ncbi.nlm.nih.gov/pubmed/30987018 http://dx.doi.org/10.3390/s19071599 |
work_keys_str_mv | AT uddinmdazher featurefusionofdeepspatialfeaturesandhandcraftedspatiotemporalfeaturesforhumanactionrecognition AT leeyoungkoo featurefusionofdeepspatialfeaturesandhandcraftedspatiotemporalfeaturesforhumanactionrecognition |