AIE-YOLO: Auxiliary Information Enhanced YOLO for Small Object Detection
Main Authors: | , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9658690/ https://www.ncbi.nlm.nih.gov/pubmed/36365919 http://dx.doi.org/10.3390/s22218221 |
Summary: | Small object detection is one of the key challenges in computer vision because small objects carry little information and feature extraction discards part of it. You Only Look Once v5 (YOLOv5) adopts the Path Aggregation Network (PANet) to alleviate this information loss, but it cannot restore information that has already been lost. To this end, an auxiliary information-enhanced YOLO (AIE-YOLO) is proposed to improve the sensitivity and detection performance of YOLOv5 on small objects. First, a context enhancement module with a receptive field size of 21×21 is proposed; it captures the global and local information of the image by fusing multi-scale receptive fields and introduces an attention branch to enhance the expression of key features and suppress background noise. To further strengthen the feature expression of small objects, the high- and low-frequency information decomposed by a wavelet transform is introduced into PANet to participate in multi-scale feature fusion, addressing the problem that the features of small objects gradually disappear after repeated downsampling and pooling operations. Experiments on the challenging Tsinghua-Tencent 100K dataset show that the mean average precision of the proposed model is 9.5% higher than that of the original YOLOv5 while maintaining real-time speed, outperforming mainstream object detection models. |
---|
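The summary attributes the disappearance of small-object features to repeated downsampling and pooling, and proposes feeding wavelet-decomposed high- and low-frequency information into PANet. The sketch below illustrates only the general idea behind that choice: a single-level 2D wavelet transform also halves the spatial resolution, but the detail bands keep what averaging would discard, so the input can be reconstructed exactly. The Haar wavelet, the function names `haar_dwt2` and `haar_idwt2`, and the toy 8×8 input are assumptions made for illustration; the record does not state which wavelet the authors use or how the bands are fused into the network.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform of an array with even height and width.

    Returns four half-resolution bands: one low-frequency approximation
    and three high-frequency detail bands.
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0        # scaled block average (low frequency)
    detail_cols = (a - b + c - d) / 2.0  # left column minus right column
    detail_rows = (a + b - c - d) / 2.0  # top row minus bottom row
    detail_diag = (a - b - c + d) / 2.0  # diagonal difference
    return low, detail_cols, detail_rows, detail_diag


def haar_idwt2(low, detail_cols, detail_rows, detail_diag):
    """Invert haar_dwt2 and return the full-resolution array."""
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (low + detail_cols + detail_rows + detail_diag) / 2.0
    x[0::2, 1::2] = (low - detail_cols + detail_rows - detail_diag) / 2.0
    x[1::2, 0::2] = (low + detail_cols - detail_rows - detail_diag) / 2.0
    x[1::2, 1::2] = (low - detail_cols - detail_rows + detail_diag) / 2.0
    return x


# Toy input (hypothetical): a single bright pixel standing in for a small object.
image = np.zeros((8, 8))
image[3, 4] = 1.0

bands = haar_dwt2(image)
restored = haar_idwt2(*bands)

print("band shape:", bands[0].shape)
print("exact reconstruction:", np.allclose(restored, image))
```

Keeping only the low-frequency band is equivalent, up to scaling, to 2×2 average pooling; the three detail bands are the extra information that a fusion stage could draw on.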