Fine-Tuned Temporal Dense Sampling with 1D Convolutional Neural Network for Human Action Recognition

Human action recognition is a constantly evolving field that is driven by numerous applications. In recent years, significant progress has been made in this area due to the development of advanced representation learning techniques. Despite this progress, human action recognition still poses signifi...

Bibliographic Details
Main authors: Lim, Kian Ming; Lee, Chin Poo; Tan, Kok Seang; Alqahtani, Ali; Ali, Mohammed
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10256091/
https://www.ncbi.nlm.nih.gov/pubmed/37300004
http://dx.doi.org/10.3390/s23115276
author Lim, Kian Ming
Lee, Chin Poo
Tan, Kok Seang
Alqahtani, Ali
Ali, Mohammed
collection PubMed
description Human action recognition is a constantly evolving field that is driven by numerous applications. In recent years, significant progress has been made in this area due to the development of advanced representation learning techniques. Despite this progress, human action recognition still poses significant challenges, particularly due to the unpredictable variations in the visual appearance of an image sequence. To address these challenges, we propose the fine-tuned temporal dense sampling with 1D convolutional neural network (FTDS-1DConvNet). Our method involves the use of temporal segmentation and temporal dense sampling, which help to capture the most important features of a human action video. First, the human action video is partitioned into segments through temporal segmentation. Each segment is then processed through a fine-tuned Inception-ResNet-V2 model, where max pooling is performed along the temporal axis to encode the most significant features as a fixed-length representation. This representation is then fed into a 1DConvNet for further representation learning and classification. The experiments on UCF101 and HMDB51 demonstrate that the proposed FTDS-1DConvNet outperforms the state-of-the-art methods, with a classification accuracy of 88.43% on the UCF101 dataset and 56.23% on the HMDB51 dataset.
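The segmentation and temporal max pooling steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes per-frame feature vectors are already available (in the paper they would come from the fine-tuned Inception-ResNet-V2, which is not reproduced here), it assumes equal-length contiguous segments, and it assumes max pooling is applied within each segment. The function names `temporal_segments`, `temporal_max_pool`, and `encode_video` are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def temporal_segments(num_frames: int, num_segments: int) -> List[range]:
    """Split frame indices 0..num_frames-1 into contiguous, near-equal segments.

    Assumption: equal-length partitioning; the record does not state
    how segment boundaries are chosen.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need 1 <= num_segments <= num_frames")
    bounds = [i * num_frames // num_segments for i in range(num_segments + 1)]
    return [range(bounds[i], bounds[i + 1]) for i in range(num_segments)]


def temporal_max_pool(frame_features: Sequence[Vector]) -> Vector:
    """Element-wise maximum over the temporal axis: (T, D) -> (D,)."""
    return [max(column) for column in zip(*frame_features)]


def encode_video(frame_features: Sequence[Vector], num_segments: int) -> List[Vector]:
    """Return one pooled vector per segment: (T, D) -> (num_segments, D).

    The output size depends only on num_segments and D, not on the
    number of frames T, which is what makes it a fixed-length representation.
    """
    segments = temporal_segments(len(frame_features), num_segments)
    return [
        temporal_max_pool([frame_features[t] for t in segment])
        for segment in segments
    ]


# Toy stand-in for CNN features: 6 frames, 2 feature dimensions each.
toy_features = [
    [1.0, 0.0], [2.0, 5.0],
    [0.0, 3.0], [4.0, 1.0],
    [1.0, 1.0], [3.0, 9.0],
]
pooled = encode_video(toy_features, num_segments=3)
print(pooled)
```

In the described method, the resulting fixed-length sequence of segment vectors would then be passed to a 1D convolutional network for classification; that stage is omitted from this sketch.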
format Online
Article
Text
id pubmed-10256091
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10256091 2023-06-10 Fine-Tuned Temporal Dense Sampling with 1D Convolutional Neural Network for Human Action Recognition Lim, Kian Ming Lee, Chin Poo Tan, Kok Seang Alqahtani, Ali Ali, Mohammed Sensors (Basel) Article Human action recognition is a constantly evolving field that is driven by numerous applications. In recent years, significant progress has been made in this area due to the development of advanced representation learning techniques. Despite this progress, human action recognition still poses significant challenges, particularly due to the unpredictable variations in the visual appearance of an image sequence. To address these challenges, we propose the fine-tuned temporal dense sampling with 1D convolutional neural network (FTDS-1DConvNet). Our method involves the use of temporal segmentation and temporal dense sampling, which help to capture the most important features of a human action video. First, the human action video is partitioned into segments through temporal segmentation. Each segment is then processed through a fine-tuned Inception-ResNet-V2 model, where max pooling is performed along the temporal axis to encode the most significant features as a fixed-length representation. This representation is then fed into a 1DConvNet for further representation learning and classification. The experiments on UCF101 and HMDB51 demonstrate that the proposed FTDS-1DConvNet outperforms the state-of-the-art methods, with a classification accuracy of 88.43% on the UCF101 dataset and 56.23% on the HMDB51 dataset. MDPI 2023-06-02 /pmc/articles/PMC10256091/ /pubmed/37300004 http://dx.doi.org/10.3390/s23115276 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Fine-Tuned Temporal Dense Sampling with 1D Convolutional Neural Network for Human Action Recognition
topic Article