Robust Action Recognition Using Multi-Scale Spatial-Temporal Concatenations of Local Features as Natural Action Structures

Humans and many other animals can detect, recognize, and classify natural actions in a very short time. How this is achieved by the visual system and how to make machines understand natural actions have been the focus of neurobiological studies and computational modeling in the last several decades....

Bibliographic Details
Main Authors:	Zhu, Xiaoyuan; Li, Meng; Li, Xiaojian; Yang, Zhiyong; Tsien, Joe Z.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2012
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3464264/ https://www.ncbi.nlm.nih.gov/pubmed/23056403 http://dx.doi.org/10.1371/journal.pone.0046686

author	Zhu, Xiaoyuan Li, Meng Li, Xiaojian Yang, Zhiyong Tsien, Joe Z.
collection	PubMed
description	Humans and many other animals can detect, recognize, and classify natural actions in a very short time. How this is achieved by the visual system and how to make machines understand natural actions have been the focus of neurobiological studies and computational modeling in the last several decades. A key issue is what spatial-temporal features should be encoded and what the characteristics of their occurrences are in natural actions. Current global encoding schemes depend heavily on segmentation, while local encoding schemes lack descriptive power. Here, we propose natural action structures, i.e., multi-size, multi-scale, spatial-temporal concatenations of local features, as the basic features for representing natural actions. In this concept, any action is a spatial-temporal concatenation of a set of natural action structures, which convey a full range of information about natural actions. We took several steps to extract these structures. First, we sampled a large number of sequences of patches at multiple spatial-temporal scales. Second, we performed independent component analysis on the patch sequences and classified the independent components into clusters. Finally, we compiled a large set of natural action structures, with each corresponding to a unique combination of the clusters at the selected spatial-temporal scales. To classify human actions, we used a set of informative natural action structures as inputs to two widely used models. We found that the natural action structures obtained here achieved a significantly better recognition performance than low-level features and that the performance was better than or comparable to the best current models. We also found that the classification performance with natural action structures as features was only slightly affected by changes of scale and artificially added noise. We concluded that the natural action structures proposed here can be used as the basic encoding units of actions and may hold the key to natural action understanding.
format	Online Article Text
id	pubmed-3464264
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-3464264 2012-10-10 PLoS One Research Article Public Library of Science 2012-10-04 /pmc/articles/PMC3464264/ /pubmed/23056403 http://dx.doi.org/10.1371/journal.pone.0046686 Text en © 2012 Zhu et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	Robust Action Recognition Using Multi-Scale Spatial-Temporal Concatenations of Local Features as Natural Action Structures
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3464264/ https://www.ncbi.nlm.nih.gov/pubmed/23056403 http://dx.doi.org/10.1371/journal.pone.0046686

Similar Items