HPnet: Hybrid Parallel Network for Human Pose Estimation
Hybrid models which combine the convolution and transformer model achieve impressive performance on human pose estimation. However, the existing hybrid models on human pose estimation, which typically stack self-attention modules after convolution, are prone to mutual conflict. The mutual conflict e...
Main authors: | Li, Haoran; Yao, Hongxun; Hou, Yuxin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181615/ https://www.ncbi.nlm.nih.gov/pubmed/37177628 http://dx.doi.org/10.3390/s23094425 |
_version_ | 1785041616689430528 |
---|---|
author | Li, Haoran Yao, Hongxun Hou, Yuxin |
author_facet | Li, Haoran Yao, Hongxun Hou, Yuxin |
author_sort | Li, Haoran |
collection | PubMed |
description | Hybrid models that combine convolution and transformer modules achieve impressive performance on human pose estimation. However, existing hybrid models for human pose estimation, which typically stack self-attention modules after convolution, are prone to mutual conflict. This conflict forces one type of module to dominate in these sequential hybrid models. Consequently, performance on higher-precision keypoint localization is not consistent with overall performance. To alleviate this mutual conflict, we developed a hybrid parallel network that runs the self-attention modules and the convolution modules in parallel, which helps leverage their complementary capabilities effectively. In the parallel network, the self-attention branch tends to model long-range dependencies to enhance the semantic representation, while the local sensitivity of the convolution branch contributes to high-precision localization. To further mitigate the conflict, we propose a cross-branches attention module that gates the features generated by both branches along the channel dimension. The hybrid parallel network achieves [Formula: see text] and [Formula: see text] AP on the COCO validation and test-dev sets and performs consistently on both higher-precision localization and overall performance. The experiments show that our hybrid parallel network is on par with state-of-the-art human pose estimation models. |
format | Online Article Text |
id | pubmed-10181615 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10181615 2023-05-13 HPnet: Hybrid Parallel Network for Human Pose Estimation Li, Haoran Yao, Hongxun Hou, Yuxin Sensors (Basel) Article Hybrid models that combine convolution and transformer modules achieve impressive performance on human pose estimation. However, existing hybrid models for human pose estimation, which typically stack self-attention modules after convolution, are prone to mutual conflict. This conflict forces one type of module to dominate in these sequential hybrid models. Consequently, performance on higher-precision keypoint localization is not consistent with overall performance. To alleviate this mutual conflict, we developed a hybrid parallel network that runs the self-attention modules and the convolution modules in parallel, which helps leverage their complementary capabilities effectively. In the parallel network, the self-attention branch tends to model long-range dependencies to enhance the semantic representation, while the local sensitivity of the convolution branch contributes to high-precision localization. To further mitigate the conflict, we propose a cross-branches attention module that gates the features generated by both branches along the channel dimension. The hybrid parallel network achieves [Formula: see text] and [Formula: see text] AP on the COCO validation and test-dev sets and performs consistently on both higher-precision localization and overall performance. The experiments show that our hybrid parallel network is on par with state-of-the-art human pose estimation models. MDPI 2023-04-30 /pmc/articles/PMC10181615/ /pubmed/37177628 http://dx.doi.org/10.3390/s23094425 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Li, Haoran Yao, Hongxun Hou, Yuxin HPnet: Hybrid Parallel Network for Human Pose Estimation |
title | HPnet: Hybrid Parallel Network for Human Pose Estimation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181615/ https://www.ncbi.nlm.nih.gov/pubmed/37177628 http://dx.doi.org/10.3390/s23094425 |
work_keys_str_mv | AT lihaoran hpnethybridparallelnetworkforhumanposeestimation AT yaohongxun hpnethybridparallelnetworkforhumanposeestimation AT houyuxin hpnethybridparallelnetworkforhumanposeestimation |
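
Illustrative note on the description field: the abstract describes a convolution branch and a self-attention branch run in parallel, with a cross-branches attention module gating both along the channel dimension. The record gives no layer details, so the following is only a minimal sketch of that general idea in plain Python, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the fixed 3-tap smoothing kernel, the identity query/key/value projections, the 1D token layout, and the parameter-free sigmoid gate in which each branch is gated by the pooled statistics of the other.

```python
import math


def conv_branch(x, kernel=(0.25, 0.5, 0.25)):
    """Local branch: depthwise 1D convolution over tokens, zero padded.

    x is a list of tokens, each a list of channel values.
    The fixed kernel is a hypothetical stand-in for learned weights.
    """
    n, c = len(x), len(x[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = []
        for ch in range(c):
            acc = 0.0
            for k, w in enumerate(kernel):
                j = i + k - half
                if 0 <= j < n:  # positions outside the sequence count as zero
                    acc += w * x[j][ch]
            row.append(acc)
        out.append(row)
    return out


def attention_branch(x):
    """Global branch: single-head self-attention with identity projections."""
    n, c = len(x), len(x[0])
    scale = 1.0 / math.sqrt(c)
    out = []
    for i in range(n):
        scores = [scale * sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(weights[j] * x[j][ch] for j in range(n)) for ch in range(c)])
    return out


def channel_gate(feat):
    """Per-channel gate: sigmoid of the mean over tokens (no learned weights)."""
    n, c = len(feat), len(feat[0])
    means = [sum(feat[i][ch] for i in range(n)) / n for ch in range(c)]
    return [1.0 / (1.0 + math.exp(-m)) for m in means]


def hybrid_parallel_block(x):
    """Run both branches on the same input and fuse them with channel gates."""
    local = conv_branch(x)
    glob = attention_branch(x)
    gate_for_local = channel_gate(glob)   # assumed: attention stats gate conv features
    gate_for_glob = channel_gate(local)   # assumed: conv stats gate attention features
    c = len(x[0])
    return [
        [gate_for_local[ch] * local[i][ch] + gate_for_glob[ch] * glob[i][ch]
         for ch in range(c)]
        for i in range(len(x))
    ]


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 tokens, 2 channels
    for row in hybrid_parallel_block(tokens):
        print(row)
```

The point of the sketch is the data flow: both branches read the same input rather than being stacked one after the other, and the fusion step weights each channel of each branch separately. A real model would use learned 2D convolutions, learned attention projections, and a learned gating layer.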