Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition
Main Authors: Weng, Zhengkui; Jin, Zhipeng; Chen, Shuangxi; Shen, Quanquan; Ren, Xiangyang; Li, Wuzhao
Format: Online Article Text
Language: English
Published: Hindawi, 2021
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8024088/ https://www.ncbi.nlm.nih.gov/pubmed/33859682 http://dx.doi.org/10.1155/2021/8890808
_version_ | 1783675239203340288 |
author | Weng, Zhengkui Jin, Zhipeng Chen, Shuangxi Shen, Quanquan Ren, Xiangyang Li, Wuzhao |
author_facet | Weng, Zhengkui Jin, Zhipeng Chen, Shuangxi Shen, Quanquan Ren, Xiangyang Li, Wuzhao |
author_sort | Weng, Zhengkui |
collection | PubMed |
description | Convolutional neural networks (CNNs) have advanced rapidly in recent years. However, high dimensionality, rich human dynamics, and various kinds of background interference make it difficult for traditional CNNs to capture complicated motion information in videos. A novel framework named the attention-based temporal encoding network (ATEN) with a background-independent motion mask (BIMM) is proposed here for video action recognition. First, we introduce a motion segmentation approach based on a boundary prior, using the minimal geodesic distance in an undirected weighted graph. Then, we propose a dynamic contrast segmentation strategy for segmenting moving objects in complicated environments. Subsequently, we build the BIMM to enhance the moving object by suppressing the irrelevant background in each frame. Furthermore, we design a long-range attention mechanism within ATEN that effectively models the long-term dependencies of complex nonperiodic actions by automatically focusing on semantically important frames rather than treating all sampled frames equally. In this way, the attention mechanism suppresses temporal redundancy and highlights discriminative frames. Finally, the framework is evaluated on the UCF101 and HMDB51 datasets, where ATEN with BIMM achieves 94.5% and 70.6% accuracy, respectively, outperforming a number of existing methods on both datasets. |
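The motion-mask stage summarized above lends itself to a small illustration. The sketch below (Python with NumPy/SciPy) is a minimal approximation of the boundary-prior idea only: cells of a motion-magnitude map form an undirected weighted grid graph, each cell's minimal geodesic distance to the frame boundary is computed by multi-source Dijkstra, and cells far from the boundary are kept as the moving foreground. The grid connectivity, the edge-weight definition, the threshold, and names such as `boundary_prior_mask` and `flow_magnitude` are illustrative assumptions, not the paper's exact BIMM formulation.

```python
# Minimal sketch of a boundary-prior motion mask via geodesic distance on an
# undirected weighted graph (assumed formulation, for illustration only).
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import dijkstra


def boundary_prior_mask(motion_magnitude: np.ndarray, threshold: float) -> np.ndarray:
    """Return a binary motion mask from a per-cell motion-magnitude map."""
    h, w = motion_magnitude.shape
    idx = np.arange(h * w).reshape(h, w)
    flat = motion_magnitude.ravel()

    # 4-connected undirected grid graph; edge weight = motion difference,
    # so paths through uniform background are "cheap".
    rows, cols, weights = [], [], []
    for src, dst in ((idx[:, :-1], idx[:, 1:]), (idx[:-1, :], idx[1:, :])):
        s, d = src.ravel(), dst.ravel()
        rows.append(s)
        cols.append(d)
        weights.append(np.abs(flat[s] - flat[d]) + 1e-6)  # epsilon keeps weights positive
    graph = coo_matrix(
        (np.concatenate(weights), (np.concatenate(rows), np.concatenate(cols))),
        shape=(h * w, h * w),
    )

    # Multi-source shortest path from all boundary cells (the boundary prior).
    boundary = np.unique(np.concatenate([idx[0, :], idx[-1, :], idx[:, 0], idx[:, -1]]))
    dist = dijkstra(graph, directed=False, indices=boundary).min(axis=0)

    # Cells whose geodesic distance to the boundary is large are treated as the
    # moving foreground; everything else is suppressed as background.
    return (dist.reshape(h, w) > threshold).astype(np.float32)


# Usage (hypothetical inputs): keep only the moving region of an RGB frame.
# frame_masked = frame * boundary_prior_mask(flow_magnitude, threshold=0.5)[..., None]
```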
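Likewise, the long-range attention over sampled frames can be sketched compactly. The PyTorch snippet below (the framework choice is an assumption) scores each frame feature, softmax-normalizes the scores over time so that semantically important frames receive larger weights, and aggregates the weighted features into a single video descriptor for classification. The single-linear-layer scoring function and the layer sizes are illustrative, not ATEN's exact design.

```python
# Minimal sketch of temporal attention pooling over sampled frame features.
import torch
import torch.nn as nn


class TemporalAttentionPooling(nn.Module):
    """Score frames, softmax over time, and pool into one video descriptor."""

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)           # per-frame importance score
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, frame_feats: torch.Tensor) -> torch.Tensor:
        # frame_feats: (batch, num_frames, feat_dim), e.g. CNN features of
        # frames already masked by the motion-mask stage.
        weights = torch.softmax(self.score(frame_feats), dim=1)   # (B, T, 1)
        video_feat = (weights * frame_feats).sum(dim=1)           # (B, D)
        return self.classifier(video_feat)                        # class logits


# Usage with random tensors standing in for per-frame CNN features:
feats = torch.randn(2, 25, 2048)                      # 2 clips, 25 sampled frames
logits = TemporalAttentionPooling(2048, 101)(feats)   # e.g. 101 UCF101 classes
```

Because the attention weights sum to one over time, uninformative frames are down-weighted rather than averaged in uniformly, which is the sense in which such a mechanism suppresses temporal redundancy and highlights discriminative frames.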
format | Online Article Text |
id | pubmed-8024088 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8024088 2021-04-14 Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition Weng, Zhengkui Jin, Zhipeng Chen, Shuangxi Shen, Quanquan Ren, Xiangyang Li, Wuzhao Comput Intell Neurosci Research Article Convolutional neural networks (CNNs) have advanced rapidly in recent years. However, high dimensionality, rich human dynamics, and various kinds of background interference make it difficult for traditional CNNs to capture complicated motion information in videos. A novel framework named the attention-based temporal encoding network (ATEN) with a background-independent motion mask (BIMM) is proposed here for video action recognition. First, we introduce a motion segmentation approach based on a boundary prior, using the minimal geodesic distance in an undirected weighted graph. Then, we propose a dynamic contrast segmentation strategy for segmenting moving objects in complicated environments. Subsequently, we build the BIMM to enhance the moving object by suppressing the irrelevant background in each frame. Furthermore, we design a long-range attention mechanism within ATEN that effectively models the long-term dependencies of complex nonperiodic actions by automatically focusing on semantically important frames rather than treating all sampled frames equally. In this way, the attention mechanism suppresses temporal redundancy and highlights discriminative frames. Finally, the framework is evaluated on the UCF101 and HMDB51 datasets, where ATEN with BIMM achieves 94.5% and 70.6% accuracy, respectively, outperforming a number of existing methods on both datasets. Hindawi 2021-03-27 /pmc/articles/PMC8024088/ /pubmed/33859682 http://dx.doi.org/10.1155/2021/8890808 Text en Copyright © 2021 Zhengkui Weng et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Weng, Zhengkui Jin, Zhipeng Chen, Shuangxi Shen, Quanquan Ren, Xiangyang Li, Wuzhao Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition |
title | Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition |
title_full | Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition |
title_fullStr | Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition |
title_full_unstemmed | Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition |
title_short | Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition |
title_sort | attention-based temporal encoding network with background-independent motion mask for action recognition |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8024088/ https://www.ncbi.nlm.nih.gov/pubmed/33859682 http://dx.doi.org/10.1155/2021/8890808 |
work_keys_str_mv | AT wengzhengkui attentionbasedtemporalencodingnetworkwithbackgroundindependentmotionmaskforactionrecognition AT jinzhipeng attentionbasedtemporalencodingnetworkwithbackgroundindependentmotionmaskforactionrecognition AT chenshuangxi attentionbasedtemporalencodingnetworkwithbackgroundindependentmotionmaskforactionrecognition AT shenquanquan attentionbasedtemporalencodingnetworkwithbackgroundindependentmotionmaskforactionrecognition AT renxiangyang attentionbasedtemporalencodingnetworkwithbackgroundindependentmotionmaskforactionrecognition AT liwuzhao attentionbasedtemporalencodingnetworkwithbackgroundindependentmotionmaskforactionrecognition |