
Alpha-SGANet: A multi-attention-scale feature pyramid network combined with lightweight network based on Alpha-IoU loss

Bibliographic Details
Main authors: Li, Hong; Zhou, Qian; Mao, Yao; Zhang, Bing; Liu, Chao
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9612525/
https://www.ncbi.nlm.nih.gov/pubmed/36301900
http://dx.doi.org/10.1371/journal.pone.0276581
Description
Summary: The design of deep convolutional neural networks has driven significant advances in object detection. However, the high computational and memory costs of such detection networks remain one of the most significant barriers to their broad adoption in edge and mobile scenarios. To address this problem, this paper introduces an improved lightweight real-time convolutional neural network based on YOLOv5, called Alpha-SGANet: a multi-attention-scale feature pyramid network combined with a lightweight network based on Alpha-IoU loss. First, we add one more prediction head to detect objects at different scales, design a lightweight and efficient feature extraction network using ShuffleNetV2 in the backbone, and reduce information loss using an SPP module with smaller convolution kernels. Next, we employ GAFPN to improve feature transition processing in the neck, using the Ghost module to construct efficient feature maps that aid prediction. We further integrate the CBAM module to find regions of interest in the scene. Finally, combining these components with Alpha-IoU loss for supervised training achieves the largest performance improvement. Experimental results on the PASCAL VOC and MS COCO datasets show that, compared with YOLOv5s, the proposed method achieves higher accuracy with fewer parameters while running at real-time speed.
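
The summary names the Alpha-IoU loss but does not state its formula. As a rough illustration only, the sketch below assumes the basic power form, loss = 1 - IoU^alpha, for a single pair of axis-aligned boxes given as (x1, y1, x2, y2). The function names, the box format, and the default alpha = 3 are assumptions made for this example and are not taken from the record. The paper may use a variant with additional penalty terms.

```python
from typing import Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Degenerate boxes with zero total area are treated as non-overlapping.
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred: Box, target: Box, alpha: float = 3.0) -> float:
    """Assumed basic power form: 1 - IoU**alpha (alpha = 1 gives plain IoU loss)."""
    return 1.0 - iou(pred, target) ** alpha


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print(iou(pred, target))
    print(alpha_iou_loss(pred, target, alpha=1.0))
    print(alpha_iou_loss(pred, target, alpha=3.0))
```

With alpha greater than 1, the loss of a poorly overlapping pair moves closer to 1 than under the plain IoU loss, while a perfect match still gives 0. In a training setting the same expression would be applied to batched tensors rather than single tuples.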