
Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications

Three-dimensional human pose estimation is widely applied in sports, robotics, and healthcare. In the past five years, numerous CNN-based studies on 3D human pose estimation have been published and have yielded impressive results. However, studies often focus only on improving the accuracy of the...


Bibliographic Details
Main Authors: Nguyen, Hung-Cuong; Nguyen, Thi-Hao; Scherer, Rafal; Le, Van-Hung
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9315644/
https://www.ncbi.nlm.nih.gov/pubmed/35891099
http://dx.doi.org/10.3390/s22145419
_version_ 1784754613163917312
author Nguyen, Hung-Cuong
Nguyen, Thi-Hao
Scherer, Rafal
Le, Van-Hung
author_facet Nguyen, Hung-Cuong
Nguyen, Thi-Hao
Scherer, Rafal
Le, Van-Hung
author_sort Nguyen, Hung-Cuong
collection PubMed
description Three-dimensional human pose estimation is widely applied in sports, robotics, and healthcare. In the past five years, numerous CNN-based studies on 3D human pose estimation have been published and have yielded impressive results. However, studies often focus only on improving the accuracy of the estimation results. In this paper, we propose a fast, unified end-to-end model for estimating 3D human pose, called YOLOv5-HR-TCM (YOLOv5-HRet-Temporal Convolution Model). Our proposed model is based on the 2D to 3D lifting approach for 3D human pose estimation while addressing each step in the estimation process, such as person detection, 2D human pose estimation, and 3D human pose estimation. The proposed model combines best practices at each stage. It is evaluated on the Human 3.6M dataset and compared with other methods at each step. The method achieves high accuracy without sacrificing processing speed. The whole pipeline runs at an estimated 3.146 FPS on a low-end computer. In particular, we propose a sports scoring application based on the deviation angle between the estimated 3D human posture and the standard (reference) origin. The average deviation angle evaluated on the Human 3.6M dataset (Protocol #1–Pro #1) is 8.2 degrees.
format Online
Article
Text
id pubmed-9315644
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9315644 2022-07-27 Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications Nguyen, Hung-Cuong Nguyen, Thi-Hao Scherer, Rafal Le, Van-Hung Sensors (Basel) Article Three-dimensional human pose estimation is widely applied in sports, robotics, and healthcare. In the past five years, numerous CNN-based studies on 3D human pose estimation have been published and have yielded impressive results. However, studies often focus only on improving the accuracy of the estimation results. In this paper, we propose a fast, unified end-to-end model for estimating 3D human pose, called YOLOv5-HR-TCM (YOLOv5-HRet-Temporal Convolution Model). Our proposed model is based on the 2D to 3D lifting approach for 3D human pose estimation while addressing each step in the estimation process, such as person detection, 2D human pose estimation, and 3D human pose estimation. The proposed model combines best practices at each stage. It is evaluated on the Human 3.6M dataset and compared with other methods at each step. The method achieves high accuracy without sacrificing processing speed. The whole pipeline runs at an estimated 3.146 FPS on a low-end computer. In particular, we propose a sports scoring application based on the deviation angle between the estimated 3D human posture and the standard (reference) origin. The average deviation angle evaluated on the Human 3.6M dataset (Protocol #1–Pro #1) is 8.2 degrees. MDPI 2022-07-20 /pmc/articles/PMC9315644/ /pubmed/35891099 http://dx.doi.org/10.3390/s22145419 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Nguyen, Hung-Cuong
Nguyen, Thi-Hao
Scherer, Rafal
Le, Van-Hung
Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications
title Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications
title_full Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications
title_fullStr Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications
title_full_unstemmed Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications
title_short Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications
title_sort unified end-to-end yolov5-hr-tcm framework for automatic 2d/3d human pose estimation for real-time applications
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9315644/
https://www.ncbi.nlm.nih.gov/pubmed/35891099
http://dx.doi.org/10.3390/s22145419
work_keys_str_mv AT nguyenhungcuong unifiedendtoendyolov5hrtcmframeworkforautomatic2d3dhumanposeestimationforrealtimeapplications
AT nguyenthihao unifiedendtoendyolov5hrtcmframeworkforautomatic2d3dhumanposeestimationforrealtimeapplications
AT schererrafal unifiedendtoendyolov5hrtcmframeworkforautomatic2d3dhumanposeestimationforrealtimeapplications
AT levanhung unifiedendtoendyolov5hrtcmframeworkforautomatic2d3dhumanposeestimationforrealtimeapplications