Shift Pose: A Lightweight Transformer-like Neural Network for Human Pose Estimation

High-performance, real-time pose detection and tracking will enable computers to develop a finer-grained and more natural understanding of human behavior. However, the implementation of real-time human pose estimation remains a challenge. On the one hand, the performance of semantic keyp...

Bibliographic Details
Main authors: Chen, Haijian, Jiang, Xinyun, Dai, Yonghui
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9570652/
https://www.ncbi.nlm.nih.gov/pubmed/36236362
http://dx.doi.org/10.3390/s22197264
author Chen, Haijian
Jiang, Xinyun
Dai, Yonghui
collection PubMed
description High-performance, real-time pose detection and tracking will enable computers to develop a finer-grained and more natural understanding of human behavior. However, the implementation of real-time human pose estimation remains a challenge. On the one hand, semantic keypoint tracking in live video footage requires substantial computational resources and a large number of parameters, which limits the accuracy of pose estimation. On the other hand, some transformer-based models were proposed recently with outstanding performance and far fewer parameters and FLOPs. However, the self-attention module in the transformer is not computationally friendly, which makes it difficult to apply these excellent models to real-time tasks. To overcome the above problems, we propose a transformer-like model, named ShiftPose, which is a regression-based approach. ShiftPose does not contain any self-attention module. Instead, we replace the self-attention module with a parameter-free operation called the shift operator. Meanwhile, we adopt a bridge–branch connection, instead of a fully branched connection such as that of HRNet, as our multi-resolution integration scheme. Specifically, the bottom half of our model adds the previous output to the output from the top half of the model at the matching resolution. Finally, the simple, yet promising, disentangled representation (SimDR) was used in our study to make the training process more stable. On the MPII dataset, the experimental results were 86.4 PCKH and 29.1 PCKH@0.1. On the COCO dataset, the results were 72.2 mAP and 91.5 AP50 at 255 fps on a GPU, with 10.2M parameters and 1.6 GFLOPs. In addition, we tested our model for single-stage 3D human pose estimation and drew several useful and exploratory conclusions. The above results show good performance, and this paper provides a new method for high-performance, real-time pose detection and tracking.
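The description says the self-attention module is replaced by a shift operator that has no parameters. The record does not specify how the shift is defined, so the following is only a minimal sketch of the general idea, assuming a 1D sequence of tokens: a fraction of the channels is taken from the previous token, an equal fraction from the next token, and the rest are left in place. The function name `shift_tokens` and the `fraction` argument are hypothetical and are not taken from the paper.

```python
from typing import List

Tokens = List[List[float]]  # layout: [num_tokens][num_channels]


def shift_tokens(tokens: Tokens, fraction: float = 0.25) -> Tokens:
    """Mix information between neighboring tokens without any weights.

    Assumed 1D variant (not the paper's exact operator): the first group of
    channels is copied from the previous token, the second group from the
    next token, and the remaining channels are unchanged. Where a neighbor
    does not exist, the shifted channels are filled with zeros.
    """
    n = len(tokens)
    if n == 0:
        return []
    c = len(tokens[0])
    # Channels per shifted group, capped so the two groups never overlap.
    g = min(int(c * fraction), c // 2)
    out = [row[:] for row in tokens]  # copy so the input is not modified
    for i in range(n):
        for ch in range(g):  # group 1: value comes from the previous token
            out[i][ch] = tokens[i - 1][ch] if i > 0 else 0.0
        for ch in range(g, 2 * g):  # group 2: value comes from the next token
            out[i][ch] = tokens[i + 1][ch] if i < n - 1 else 0.0
    return out


example = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]]
shifted = shift_tokens(example, fraction=0.25)
```

Because the operation only moves existing values, it adds no learnable parameters and no multiplications; in a transformer-like block, the mixing across positions that attention would provide is left to the shift, while the following feed-forward layers do the learned computation.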
format Online
Article
Text
id pubmed-9570652
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9570652 2022-10-17 Shift Pose: A Lightweight Transformer-like Neural Network for Human Pose Estimation Chen, Haijian Jiang, Xinyun Dai, Yonghui Sensors (Basel) Article MDPI 2022-09-25 /pmc/articles/PMC9570652/ /pubmed/36236362 http://dx.doi.org/10.3390/s22197264 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Shift Pose: A Lightweight Transformer-like Neural Network for Human Pose Estimation
topic Article