An Efficient Human Instance-Guided Framework for Video Action Recognition

In recent years, human action recognition has been studied by many computer vision researchers. Recent studies have attempted to use two-stream networks using appearance and motion features, but most of these approaches focused on clip-level video action recognition. In contrast to traditional metho...

Bibliographic Details
Main Authors: Lee, Inwoong; Kim, Doyoung; Wee, Dongyoon; Lee, Sanghoon
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8709376/
https://www.ncbi.nlm.nih.gov/pubmed/34960404
http://dx.doi.org/10.3390/s21248309
_version_ 1784622920151072768
author Lee, Inwoong
Kim, Doyoung
Wee, Dongyoon
Lee, Sanghoon
author_facet Lee, Inwoong
Kim, Doyoung
Wee, Dongyoon
Lee, Sanghoon
author_sort Lee, Inwoong
collection PubMed
description In recent years, human action recognition has been studied by many computer vision researchers. Recent studies have attempted to use two-stream networks using appearance and motion features, but most of these approaches focused on clip-level video action recognition. In contrast to traditional methods which generally used entire images, we propose a new human instance-level video action recognition framework. In this framework, we represent the instance-level features using human boxes and keypoints, and our action region features are used as the inputs of the temporal action head network, which makes our framework more discriminative. We also propose novel temporal action head networks consisting of various modules, which reflect various temporal dynamics well. In the experiment, the proposed models achieve comparable performance with the state-of-the-art approaches on two challenging datasets. Furthermore, we evaluate the proposed features and networks to verify the effectiveness of them. Finally, we analyze the confusion matrix and visualize the recognized actions at human instance level when there are several people.
format Online
Article
Text
id pubmed-8709376
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-87093762021-12-25 An Efficient Human Instance-Guided Framework for Video Action Recognition Lee, Inwoong Kim, Doyoung Wee, Dongyoon Lee, Sanghoon Sensors (Basel) Article In recent years, human action recognition has been studied by many computer vision researchers. Recent studies have attempted to use two-stream networks using appearance and motion features, but most of these approaches focused on clip-level video action recognition. In contrast to traditional methods which generally used entire images, we propose a new human instance-level video action recognition framework. In this framework, we represent the instance-level features using human boxes and keypoints, and our action region features are used as the inputs of the temporal action head network, which makes our framework more discriminative. We also propose novel temporal action head networks consisting of various modules, which reflect various temporal dynamics well. In the experiment, the proposed models achieve comparable performance with the state-of-the-art approaches on two challenging datasets. Furthermore, we evaluate the proposed features and networks to verify the effectiveness of them. Finally, we analyze the confusion matrix and visualize the recognized actions at human instance level when there are several people. MDPI 2021-12-12 /pmc/articles/PMC8709376/ /pubmed/34960404 http://dx.doi.org/10.3390/s21248309 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Lee, Inwoong
Kim, Doyoung
Wee, Dongyoon
Lee, Sanghoon
An Efficient Human Instance-Guided Framework for Video Action Recognition
title An Efficient Human Instance-Guided Framework for Video Action Recognition
title_full An Efficient Human Instance-Guided Framework for Video Action Recognition
title_fullStr An Efficient Human Instance-Guided Framework for Video Action Recognition
title_full_unstemmed An Efficient Human Instance-Guided Framework for Video Action Recognition
title_short An Efficient Human Instance-Guided Framework for Video Action Recognition
title_sort efficient human instance-guided framework for video action recognition
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8709376/
https://www.ncbi.nlm.nih.gov/pubmed/34960404
http://dx.doi.org/10.3390/s21248309
work_keys_str_mv AT leeinwoong anefficienthumaninstanceguidedframeworkforvideoactionrecognition
AT kimdoyoung anefficienthumaninstanceguidedframeworkforvideoactionrecognition
AT weedongyoon anefficienthumaninstanceguidedframeworkforvideoactionrecognition
AT leesanghoon anefficienthumaninstanceguidedframeworkforvideoactionrecognition
AT leeinwoong efficienthumaninstanceguidedframeworkforvideoactionrecognition
AT kimdoyoung efficienthumaninstanceguidedframeworkforvideoactionrecognition
AT weedongyoon efficienthumaninstanceguidedframeworkforvideoactionrecognition
AT leesanghoon efficienthumaninstanceguidedframeworkforvideoactionrecognition