
Marginalised Stacked Denoising Autoencoders for Robust Representation of Real-Time Multi-View Action Recognition

Multi-view action recognition has attracted great interest in video surveillance, human-computer interaction, and multimedia retrieval, where multiple cameras of different types are deployed to provide complementary fields of view. Fusion of multiple camera views evidently leads to more robust deci...


Bibliographic Details
Main authors: Gu, Feng, Flórez-Revuelta, Francisco, Monekosso, Dorothy, Remagnino, Paolo
Format: Online Article Text
Language: English
Published: MDPI 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4541930/
https://www.ncbi.nlm.nih.gov/pubmed/26193271
http://dx.doi.org/10.3390/s150717209
author Gu, Feng
Flórez-Revuelta, Francisco
Monekosso, Dorothy
Remagnino, Paolo
collection PubMed
description Multi-view action recognition has attracted great interest in video surveillance, human-computer interaction, and multimedia retrieval, where multiple cameras of different types are deployed to provide complementary fields of view. Fusion of multiple camera views evidently leads to more robust decisions in both tracking multiple targets and analysing complex human activities, especially where there are occlusions. In this paper, we incorporate the marginalised stacked denoising autoencoders (mSDA) algorithm to further improve the bag-of-words (BoW) representation in terms of robustness and usefulness for multi-view action recognition. The resulting representations are fed into three simple fusion strategies as well as a multiple kernel learning algorithm at the classification stage. Based on the internal evaluation, the codebook size of the BoW representation and the number of mSDA layers may not significantly affect recognition performance. According to results on three multi-view benchmark datasets, the proposed framework improves recognition performance across all three datasets and achieves record recognition performance, outperforming the state-of-the-art algorithms in the literature. It is also capable of real-time action recognition at a frame rate ranging from 33 to 45, which could be further improved by using more powerful machines in future applications.
format Online
Article
Text
id pubmed-4541930
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4541930 2015-08-26 Marginalised Stacked Denoising Autoencoders for Robust Representation of Real-Time Multi-View Action Recognition Gu, Feng Flórez-Revuelta, Francisco Monekosso, Dorothy Remagnino, Paolo Sensors (Basel) Article MDPI 2015-07-16 /pmc/articles/PMC4541930/ /pubmed/26193271 http://dx.doi.org/10.3390/s150717209 Text en © 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
title Marginalised Stacked Denoising Autoencoders for Robust Representation of Real-Time Multi-View Action Recognition
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4541930/
https://www.ncbi.nlm.nih.gov/pubmed/26193271
http://dx.doi.org/10.3390/s150717209