
HPnet: Hybrid Parallel Network for Human Pose Estimation

Hybrid models that combine convolution and transformer modules achieve impressive performance on human pose estimation. However, existing hybrid models for human pose estimation, which typically stack self-attention modules after convolution, are prone to mutual conflict. The mutual conflict e...


Bibliographic Details
Main Authors: Li, Haoran; Yao, Hongxun; Hou, Yuxin
Format: Online Article Text
Language: English
Published: MDPI 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181615/
https://www.ncbi.nlm.nih.gov/pubmed/37177628
http://dx.doi.org/10.3390/s23094425
_version_ 1785041616689430528
author Li, Haoran
Yao, Hongxun
Hou, Yuxin
collection PubMed
description Hybrid models that combine convolution and transformer modules achieve impressive performance on human pose estimation. However, existing hybrid models for human pose estimation, which typically stack self-attention modules after convolution, are prone to mutual conflict. This conflict forces one type of module to dominate in these sequential hybrid models. Consequently, performance on higher-precision keypoint localization is not consistent with overall performance. To alleviate the conflict, we developed a hybrid parallel network that places the self-attention modules and the convolution modules in parallel, which helps to leverage their complementary capabilities effectively. In the parallel network, the self-attention branch tends to model long-range dependencies to enhance the semantic representation, while the local sensitivity of the convolution branch contributes to high-precision localization. To further mitigate the conflict, we proposed a cross-branch attention module that gates the features generated by both branches along the channel dimension. The hybrid parallel network achieves [Formula: see text] and [Formula: see text] AP on the COCO validation and test-dev sets and performs consistently on both higher-precision localization and overall performance. The experiments show that our hybrid parallel network is on par with state-of-the-art human pose estimation models.
format Online
Article
Text
id pubmed-10181615
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10181615 2023-05-13 HPnet: Hybrid Parallel Network for Human Pose Estimation Li, Haoran Yao, Hongxun Hou, Yuxin Sensors (Basel) Article MDPI 2023-04-30 /pmc/articles/PMC10181615/ /pubmed/37177628 http://dx.doi.org/10.3390/s23094425 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title HPnet: Hybrid Parallel Network for Human Pose Estimation
topic Article
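The abstract in this record describes a convolution branch and a self-attention branch running in parallel, with a cross-branch attention module gating their features along the channel dimension. The record does not give the architecture in detail, so the following is only a minimal sketch of that general idea under stated assumptions: features are a short 1D sequence of positions by channels, the convolution uses a fixed smoothing kernel, the attention uses identity projections, and the gate is a hypothetical per-channel sigmoid of pooled branch statistics. None of the function names or parameters come from the paper.

```python
import math


def conv_branch(x, kernel=(0.25, 0.5, 0.25)):
    """Local branch: depthwise 1D convolution over positions, zero padded.

    The fixed kernel is an assumption; a real model would learn it.
    """
    t, half = len(x), len(kernel) // 2
    out = []
    for i in range(t):
        row = [0.0] * len(x[0])
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < t:
                for c, v in enumerate(x[j]):
                    row[c] += w * v
        out.append(row)
    return out


def attention_branch(x):
    """Global branch: single-head self-attention with identity Q/K/V (assumption)."""
    scale = 1.0 / math.sqrt(len(x[0]))
    out = []
    for q in x:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in x]
        m = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * v[c] for w, v in zip(weights, x))
                    for c in range(len(q))])
    return out


def channel_gate(local, glob):
    """Hypothetical cross-branch gate: one sigmoid weight per channel.

    Each branch is average-pooled over positions; the gate for a channel is
    the sigmoid of the pooled difference, and the output is a per-channel
    convex combination of the two branches.
    """
    t, c = len(local), len(local[0])
    gates = []
    for ch in range(c):
        pooled_local = sum(row[ch] for row in local) / t
        pooled_glob = sum(row[ch] for row in glob) / t
        gates.append(1.0 / (1.0 + math.exp(-(pooled_local - pooled_glob))))
    fused = [[g * lv + (1.0 - g) * gv for g, lv, gv in zip(gates, lr, gr)]
             for lr, gr in zip(local, glob)]
    return fused, gates


def hybrid_parallel_block(x):
    """Run both branches on the same input in parallel, then gate per channel."""
    local = conv_branch(x)
    glob = attention_branch(x)
    return channel_gate(local, glob)


# Usage: four positions, two channels.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
fused, gates = hybrid_parallel_block(features)
```

Because both branches read the same input rather than one feeding the other, neither branch's output is overwritten by the other before fusion; the gate then decides, channel by channel, how much of each branch to keep.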