
PA-Tran: Learning to Estimate 3D Hand Pose with Partial Annotation

This paper tackles a novel and challenging problem: 3D hand pose estimation (HPE) from a single RGB image using partial annotation. Most HPE methods ignore the fact that keypoints may be only partially visible (e.g., under occlusion). In contrast, we propose a deep-learning framework, PA-Tran, that jointly estimates keypoint status and 3D hand pose from a single RGB image with two dependent branches. The regression branch consists of a Transformer encoder trained to predict a set of target keypoints, given an input set of status, position, and visual feature embeddings from a convolutional neural network (CNN); the classification branch adopts a CNN for estimating keypoint status. One key idea of PA-Tran is a selective mask training (SMT) objective that uses a binary encoding scheme to represent the status of each keypoint as observed or unobserved during training. By explicitly encoding the label status (observed/unobserved), PA-Tran can efficiently handle the case where only partial annotation is available. Investigating annotation percentages ranging from 50% to 100%, we show that training with partial annotation is more efficient (e.g., achieving the best PA-MPJPE of 6.0 when using about 85% of the annotations). Moreover, we provide two new datasets: APDM-Hand, a synthetic hand dataset with APDM sensor accessories designed for a specific hand task, and PD-APDM-Hand, a real hand dataset collected from Parkinson’s Disease (PD) patients with partial annotation. PA-Tran achieves higher estimation accuracy when evaluated on both proposed datasets and a more general hand dataset.
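Since the abstract only outlines the architecture, the following is a minimal, illustrative PyTorch sketch of the two-branch design and the selective-mask supervision it describes. All module sizes and layer choices, the 21-keypoint count, and the names `PATranSketch` and `masked_pose_loss` are assumptions made for illustration; this is not the authors' implementation.

```python
# A minimal sketch of the two-branch idea from the abstract. Everything
# below (sizes, layers, names) is assumed for illustration only.
import torch
import torch.nn as nn

NUM_KEYPOINTS = 21  # common hand-keypoint count; assumed, not from the paper


class PATranSketch(nn.Module):
    def __init__(self, feat_dim=256, num_kpts=NUM_KEYPOINTS):
        super().__init__()
        # Stand-in visual backbone (a real model would use e.g. a ResNet).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=7, stride=4, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.to_tokens = nn.Linear(feat_dim, num_kpts * feat_dim)
        # Classification branch: per-keypoint observed/unobserved logits.
        self.status_head = nn.Linear(feat_dim, num_kpts)
        # Binary status (0 = unobserved, 1 = observed) embedded into tokens.
        self.status_embed = nn.Embedding(2, feat_dim)
        self.pos_embed = nn.Parameter(torch.zeros(num_kpts, feat_dim))
        # Regression branch: Transformer encoder over keypoint tokens.
        layer = nn.TransformerEncoderLayer(feat_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.kpt_head = nn.Linear(feat_dim, 3)  # 3D coordinates per keypoint

    def forward(self, img, status=None):
        b = img.shape[0]
        feat = self.backbone(img).flatten(1)          # (B, feat_dim)
        status_logits = self.status_head(feat)        # (B, K)
        if status is None:                            # inference: use predictions
            status = (status_logits.sigmoid() > 0.5).long()
        tokens = self.to_tokens(feat).view(b, -1, self.pos_embed.shape[-1])
        # Each keypoint token sums visual, positional, and status embeddings.
        tokens = tokens + self.pos_embed + self.status_embed(status)
        kpts = self.kpt_head(self.encoder(tokens))    # (B, K, 3)
        return kpts, status_logits


def masked_pose_loss(pred, target, annotated):
    """Selective supervision: regress only keypoints whose labels exist."""
    per_joint = (pred - target).norm(dim=-1)          # (B, K) joint errors
    return (per_joint * annotated).sum() / annotated.sum().clamp(min=1)
```

In this reading, the binary annotation mask comes from the dataset during training, while at test time the classification branch's predicted status stands in for it, which mirrors the dependence between the two branches that the abstract describes.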

Bibliographic Details
Main Authors: Yu, Tianze; Bidulka, Luke; McKeown, Martin J.; Wang, Z. Jane
Format: Online Article Text
Language: English
Published: Sensors (Basel), MDPI, 2023-01-31
Subjects: Article
Collection: PubMed (record pubmed-9919574, National Center for Biotechnology Information)
License: © 2023 by the authors. Open access under the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9919574/
https://www.ncbi.nlm.nih.gov/pubmed/36772595
http://dx.doi.org/10.3390/s23031555