
Skeleton Driven Action Recognition Using an Image-Based Spatial-Temporal Representation and Convolution Neural Network

Individuals with Autism Spectrum Disorder (ASD) typically present difficulties in engaging and interacting with their peers. Thus, researchers have been developing different technological solutions as support tools for children with ASD. Social robots, one example of these technological solutions, a...


Bibliographic Details
Main authors:	Silva, Vinícius; Soares, Filomena; Leão, Celina P.; Esteves, João Sena; Vercelli, Gianni
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8271982/ https://www.ncbi.nlm.nih.gov/pubmed/34201991 http://dx.doi.org/10.3390/s21134342

author	Silva, Vinícius; Soares, Filomena; Leão, Celina P.; Esteves, João Sena; Vercelli, Gianni
author_sort	Silva, Vinícius
collection	PubMed
description	Individuals with Autism Spectrum Disorder (ASD) typically have difficulty engaging and interacting with their peers. Researchers have therefore been developing technological solutions as support tools for children with ASD. Social robots, one example of these solutions, are often unaware of their game partners, which prevents them from automatically adapting their behavior to the user. One source of information that can enrich this interaction and, consequently, allow the system to adapt its behavior is the recognition of the user’s actions using RGB cameras and/or depth sensors. The present work proposes a method to automatically detect, in real time, typical and stereotypical actions of children with ASD, using the Intel RealSense and the Nuitrack SDK to detect and extract the user’s joint coordinates. The pipeline starts by mapping the temporal and spatial dynamics of the joints onto a color image-based representation. Usually, the joints in the final image are clustered into groups. To verify whether the sequence of the joints in the final image representation influences the model’s performance, two main experiments were conducted: in the first, the order of the grouped joints in the sequence was changed, and in the second, the joints were randomly ordered. In each experiment, statistical methods were used in the analysis. The experiments revealed statistically significant differences related to the joint sequence in the image, indicating that the order of the joints might impact the model’s performance. The final model, a Convolutional Neural Network (CNN) trained on the different actions (typical and stereotypical), was used to classify the different patterns of behavior, achieving a mean accuracy of 92.4% ± 0.0% on the test data. The entire pipeline ran at 31 FPS on average.
format	Online Article Text
id	pubmed-8271982
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8271982 2021-07-11 Sensors (Basel) Article MDPI 2021-06-25 /pmc/articles/PMC8271982/ /pubmed/34201991 http://dx.doi.org/10.3390/s21134342 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Skeleton Driven Action Recognition Using an Image-Based Spatial-Temporal Representation and Convolution Neural Network
topic	Article
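
The description field above mentions mapping the temporal and spatial dynamics of the joints onto a color image and then changing the order of the joints in that image. The record does not give the exact encoding, so the following is only a minimal sketch under stated assumptions: joints become image rows, frames become columns, and the min-max normalized x, y, z coordinates become the three color channels. The function name `skeleton_to_image`, the `joint_order` parameter, and the array shapes are hypothetical and are not taken from the article.

```python
import numpy as np


def skeleton_to_image(seq, joint_order=None):
    """Encode a skeleton sequence as an RGB image (assumed encoding).

    seq: array of shape (frames, joints, 3) holding x, y, z per joint.
    joint_order: optional sequence of joint indices giving the row order.
    Returns a uint8 array of shape (joints, frames, 3).
    """
    seq = np.asarray(seq, dtype=np.float64)
    if seq.ndim != 3 or seq.shape[2] != 3:
        raise ValueError("expected an array of shape (frames, joints, 3)")
    if joint_order is not None:
        # Reorder the joints before the image is built.
        seq = seq[:, joint_order, :]
    # Min and max of each coordinate over all frames and joints.
    lo = seq.min(axis=(0, 1), keepdims=True)
    hi = seq.max(axis=(0, 1), keepdims=True)
    # Guard against a constant coordinate (zero range).
    span = np.where(hi > lo, hi - lo, 1.0)
    norm = (seq - lo) / span  # values in [0, 1]
    img = np.rint(norm * 255).astype(np.uint8)
    # Rows are joints, columns are frames, channels are x, y, z.
    return img.transpose(1, 0, 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder data: 30 frames, 19 joints (both numbers are arbitrary).
    seq = rng.normal(size=(30, 19, 3))
    grouped = skeleton_to_image(seq)
    shuffled = skeleton_to_image(seq, joint_order=rng.permutation(19))
    print(grouped.shape, shuffled.shape)
```

In this sketch, reordering the joints only permutes the rows of the image. A convolutional filter sees neighboring rows together, which is one plausible reason the joint order could affect a CNN, as the description suggests.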
