Cargando…

Efficient-Lightweight YOLO: Improving Small Object Detection in YOLO for Aerial Images

The most significant technical challenges of current aerial image object-detection tasks are the extremely low accuracy for detecting small objects that are densely distributed within a scene and the lack of semantic information. Moreover, existing detectors with large parameter scales are unsuitabl...

Descripción completa

Detalles Bibliográficos
Autores principales: Hu, Mengzi, Li, Ziyang, Yu, Jiong, Wan, Xueqiang, Tan, Haotian, Lin, Zeyu
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10385816/
https://www.ncbi.nlm.nih.gov/pubmed/37514717
http://dx.doi.org/10.3390/s23146423
Descripción
Sumario:The most significant technical challenges of current aerial image object-detection tasks are the extremely low accuracy for detecting small objects that are densely distributed within a scene and the lack of semantic information. Moreover, existing detectors with large parameter scales are unsuitable for aerial image object-detection scenarios oriented toward low-end GPUs. To address this technical challenge, we propose efficient-lightweight You Only Look Once (EL-YOLO), an innovative model that overcomes the limitations of existing detectors and low-end GPU orientation. EL-YOLO surpasses the baseline models in three key areas. Firstly, we design and scrutinize three model architectures to intensify the model’s focus on small objects and identify the most effective network structure. Secondly, we design efficient spatial pyramid pooling (ESPP) to augment the representation of small-object features in aerial images. Lastly, we introduce the alpha-complete intersection over union (α-CIoU) loss function to tackle the imbalance between positive and negative samples in aerial images. Our proposed EL-YOLO method demonstrates a strong generalization and robustness for the small-object detection problem in aerial images. The experimental results show that, with the model parameters maintained below 10 M while the input image size was unified at 640 × 640 pixels, the AP(S) of the EL-YOLOv5 reached 10.8% and 10.7% and enhanced the APs by 1.9% and 2.2% compared to YOLOv5 on two challenging aerial image datasets, DIOR and VisDrone, respectively.