Top-Down System for Multi-Person 3D Absolute Pose Estimation from Monocular Videos
Two-dimensional (2D) multi-person pose estimation and three-dimensional (3D) root-relative pose estimation from a monocular RGB camera have made significant progress recently. Yet, real-world applications require depth estimations and the ability to determine the distances between people in a scene....
Main authors: | El Kaid, Amal; Brazey, Denis; Barra, Vincent; Baïna, Karim |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9185275/ https://www.ncbi.nlm.nih.gov/pubmed/35684728 http://dx.doi.org/10.3390/s22114109 |
_version_ | 1784724684208603136 |
---|---|
author | El Kaid, Amal Brazey, Denis Barra, Vincent Baïna, Karim |
author_facet | El Kaid, Amal Brazey, Denis Barra, Vincent Baïna, Karim |
author_sort | El Kaid, Amal |
collection | PubMed |
description | Two-dimensional (2D) multi-person pose estimation and three-dimensional (3D) root-relative pose estimation from a monocular RGB camera have made significant progress recently. Yet real-world applications require depth estimation and the ability to determine the distances between people in a scene. Therefore, it is necessary to recover the 3D absolute poses of several people. However, this is still a challenge when using a camera with a single point of view. Furthermore, previously proposed systems typically required a significant amount of resources and memory. To overcome these restrictions, we herein propose a real-time framework for multi-person 3D absolute pose estimation from a monocular camera, which integrates a human detector, a 2D pose estimator, a 3D root-relative pose reconstructor, and a root depth estimator in a top-down manner. The proposed system, called Root-GAST-Net, is based on modified versions of the GAST-Net and RootNet networks. The efficiency of the proposed Root-GAST-Net system is demonstrated through quantitative and qualitative evaluations on two benchmark datasets, Human3.6M and MuPoTS-3D. On all evaluated metrics, our experimental results on the MuPoTS-3D dataset outperform the current state of the art by a significant margin, and the system runs in real time at 15 fps on an Nvidia GeForce GTX 1080. |
format | Online Article Text |
id | pubmed-9185275 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-91852752022-06-11 Top-Down System for Multi-Person 3D Absolute Pose Estimation from Monocular Videos El Kaid, Amal Brazey, Denis Barra, Vincent Baïna, Karim Sensors (Basel) Article Two-dimensional (2D) multi-person pose estimation and three-dimensional (3D) root-relative pose estimation from a monocular RGB camera have made significant progress recently. Yet, real-world applications require depth estimations and the ability to determine the distances between people in a scene. Therefore, it is necessary to recover the 3D absolute poses of several people. However, this is still a challenge when using cameras from single points of view. Furthermore, the previously proposed systems typically required a significant amount of resources and memory. To overcome these restrictions, we herein propose a real-time framework for multi-person 3D absolute pose estimation from a monocular camera, which integrates a human detector, a 2D pose estimator, a 3D root-relative pose reconstructor, and a root depth estimator in a top-down manner. The proposed system, called Root-GAST-Net, is based on modified versions of GAST-Net and RootNet networks. The efficiency of the proposed Root-GAST-Net system is demonstrated through quantitative and qualitative evaluations on two benchmark datasets, Human3.6M and MuPoTS-3D. On all evaluated metrics, our experimental results on the MuPoTS-3D dataset outperform the current state-of-the-art by a significant margin, and can run in real-time at 15 fps on the Nvidia GeForce GTX 1080. MDPI 2022-05-28 /pmc/articles/PMC9185275/ /pubmed/35684728 http://dx.doi.org/10.3390/s22114109 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article El Kaid, Amal Brazey, Denis Barra, Vincent Baïna, Karim Top-Down System for Multi-Person 3D Absolute Pose Estimation from Monocular Videos |
title | Top-Down System for Multi-Person 3D Absolute Pose Estimation from Monocular Videos |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9185275/ https://www.ncbi.nlm.nih.gov/pubmed/35684728 http://dx.doi.org/10.3390/s22114109 |
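
The description above says that absolute 3D poses are needed to measure distances between people, and that the system combines a root-relative pose with an estimated root depth. The record does not give the formulas, so the following is only a minimal sketch of one common way such outputs can be combined, assuming a standard pinhole camera model. The function names, the camera intrinsics (`fx`, `fy`, `cx`, `cy`), and all numeric values are hypothetical and are not taken from Root-GAST-Net.

```python
import numpy as np


def backproject_root(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of a root pixel (u, v) at a given depth
    into camera coordinates. Assumption: depth and output share one unit."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth], dtype=float)


def absolute_pose(root_relative_pose, root_xyz):
    """Shift a root-relative pose of shape (J, 3) by the absolute root (3,)."""
    return np.asarray(root_relative_pose, dtype=float) + root_xyz


def root_distance(root_a, root_b):
    """Euclidean distance between two absolute root positions."""
    return float(np.linalg.norm(root_a - root_b))


# Hypothetical intrinsics and detections, for illustration only.
fx = fy = 1000.0
cx = cy = 500.0

# Person A: root at the image centre, 2000 mm from the camera.
root_a = backproject_root(500.0, 500.0, 2000.0, fx, fy, cx, cy)
# Person B: root 500 px to the right, 4000 mm from the camera.
root_b = backproject_root(1000.0, 500.0, 4000.0, fx, fy, cx, cy)

# A toy two-joint root-relative pose: the root itself and a joint 500 mm above it
# (image y points down, so "above" is negative y).
relative_pose = np.array([[0.0, 0.0, 0.0], [0.0, -500.0, 0.0]])
pose_a = absolute_pose(relative_pose, root_a)

distance_ab = root_distance(root_a, root_b)
print(pose_a)
print(distance_ab)
```

In this sketch the per-person 3D pose only needs a translation by the back-projected root to become absolute, which is why a separate root depth estimate is enough to recover inter-person distances.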