“Reading Pictures Instead of Looking”: RGB-D Image-Based Action Recognition via Capsule Network and Kalman Filter

This paper proposes an action recognition algorithm based on the capsule network and Kalman filter called “Reading Pictures Instead of Looking” (RPIL). This method resolves the convolutional neural network’s oversensitivity to rotation and scaling and increases the interpretability of the model as...

Bibliographic Details
Main Authors:	Zhao, Botong; Wang, Yanjie; Su, Keke; Ren, Hong; Sun, Haichao
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8005215/ https://www.ncbi.nlm.nih.gov/pubmed/33810140 http://dx.doi.org/10.3390/s21062217

author	Zhao, Botong; Wang, Yanjie; Su, Keke; Ren, Hong; Sun, Haichao
author_facet	Zhao, Botong; Wang, Yanjie; Su, Keke; Ren, Hong; Sun, Haichao
author_sort	Zhao, Botong
collection	PubMed
description	This paper proposes an action recognition algorithm based on the capsule network and Kalman filter called “Reading Pictures Instead of Looking” (RPIL). This method resolves the convolutional neural network’s oversensitivity to rotation and scaling and increases the interpretability of the model through the spatial coordinates in graphics. The capsule network is first used to obtain the components of the target human body. The detected parts and their attribute parameters (e.g., spatial coordinates, color) are then analyzed by BERT. A Kalman filter analyzes the predicted capsules and filters out mispredictions so that incorrectly predicted capsules do not affect the action recognition results. The parameters between neuron layers are evaluated, then the structure is pruned into a dendritic network to enhance the computational efficiency of the algorithm. This minimizes the dependence of deep learning on the random features extracted by the CNN without sacrificing the model’s accuracy. The association between hidden layers of the neural network is also explained. With a 90% observation rate, the test precision is 83.3% on the OAD dataset, 72.2% on the ChaLearn Gesture dataset, and 86.5% on the G3D dataset. The RPILNet also satisfies real-time operation requirements (>30 fps).
format	Online Article Text
id	pubmed-8005215
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8005215 2021-03-29 Sensors (Basel) Article MDPI 2021-03-22 /pmc/articles/PMC8005215/ /pubmed/33810140 http://dx.doi.org/10.3390/s21062217 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	“Reading Pictures Instead of Looking”: RGB-D Image-Based Action Recognition via Capsule Network and Kalman Filter
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8005215/ https://www.ncbi.nlm.nih.gov/pubmed/33810140 http://dx.doi.org/10.3390/s21062217
