HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection

The goals of object detection are to accurately detect and locate objects of various sizes in digital images. Multi-scale processing technology can improve the detection accuracy of the detector. Feature pyramid networks (FPNs) have been proven to be effective in extracting multi-scaled features. Ho...

Bibliographic Details
Main Authors: Dang, Jin, Tang, Xiaofen, Li, Shuai
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181737/
https://www.ncbi.nlm.nih.gov/pubmed/37177710
http://dx.doi.org/10.3390/s23094508
_version_ 1785041646191116288
author Dang, Jin
Tang, Xiaofen
Li, Shuai
author_facet Dang, Jin
Tang, Xiaofen
Li, Shuai
author_sort Dang, Jin
collection PubMed
description The goals of object detection are to accurately detect and locate objects of various sizes in digital images. Multi-scale processing technology can improve the detection accuracy of the detector. Feature pyramid networks (FPNs) have been proven to be effective in extracting multi-scaled features. However, most existing object detection methods recognize objects in isolation, without considering contextual information between objects. Moreover, for the sake of computational efficiency, a significant reduction in the channel dimension may lead to the loss of semantic information. This study explores the utilization of attention mechanisms to augment the representational power and efficiency of features, ultimately improving the accuracy and efficiency of object detection. The study proposes a novel hierarchical attention feature pyramid network (HA-FPN), which comprises two key components: transformer feature pyramid networks (TFPNs) and channel attention modules (CAMs). In TFPNs, multi-scaled convolutional features are embedded as tokens and self-attention is applied both within and across scales to capture contextual information between the tokens. CAMs are employed to select the channels with rich channel information to alleviate massive channel information losses. By introducing contextual information and attention mechanisms, the HA-FPN significantly improves the accuracy of bounding box detection, leading to more precise identification and localization of target objects. Extensive experiments conducted on the challenging MS COCO dataset demonstrate that the proposed HA-FPN outperforms existing multi-object detection models, while incurring minimal computational overhead.
format Online
Article
Text
id pubmed-10181737
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10181737 2023-05-13 HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection Dang, Jin Tang, Xiaofen Li, Shuai Sensors (Basel) Article The goals of object detection are to accurately detect and locate objects of various sizes in digital images. Multi-scale processing technology can improve the detection accuracy of the detector. Feature pyramid networks (FPNs) have been proven to be effective in extracting multi-scaled features. However, most existing object detection methods recognize objects in isolation, without considering contextual information between objects. Moreover, for the sake of computational efficiency, a significant reduction in the channel dimension may lead to the loss of semantic information. This study explores the utilization of attention mechanisms to augment the representational power and efficiency of features, ultimately improving the accuracy and efficiency of object detection. The study proposes a novel hierarchical attention feature pyramid network (HA-FPN), which comprises two key components: transformer feature pyramid networks (TFPNs) and channel attention modules (CAMs). In TFPNs, multi-scaled convolutional features are embedded as tokens and self-attention is applied both within and across scales to capture contextual information between the tokens. CAMs are employed to select the channels with rich channel information to alleviate massive channel information losses. By introducing contextual information and attention mechanisms, the HA-FPN significantly improves the accuracy of bounding box detection, leading to more precise identification and localization of target objects. Extensive experiments conducted on the challenging MS COCO dataset demonstrate that the proposed HA-FPN outperforms existing multi-object detection models, while incurring minimal computational overhead.
MDPI 2023-05-05 /pmc/articles/PMC10181737/ /pubmed/37177710 http://dx.doi.org/10.3390/s23094508 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Dang, Jin
Tang, Xiaofen
Li, Shuai
HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection
title HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection
title_full HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection
title_fullStr HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection
title_full_unstemmed HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection
title_short HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection
title_sort ha-fpn: hierarchical attention feature pyramid network for object detection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181737/
https://www.ncbi.nlm.nih.gov/pubmed/37177710
http://dx.doi.org/10.3390/s23094508
work_keys_str_mv AT dangjin hafpnhierarchicalattentionfeaturepyramidnetworkforobjectdetection
AT tangxiaofen hafpnhierarchicalattentionfeaturepyramidnetworkforobjectdetection
AT lishuai hafpnhierarchicalattentionfeaturepyramidnetworkforobjectdetection