
Multi-Object Detection in Security Screening Scene Based on Convolutional Neural Network

Bibliographic Details
Main Authors: Sun, Fan; Zhang, Xiangfeng; Liu, Yunzhong; Jiang, Hong
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9611169/
https://www.ncbi.nlm.nih.gov/pubmed/36298187
http://dx.doi.org/10.3390/s22207836
Description
Summary: Target detection based on convolutional neural networks has been widely deployed in industry. However, detection accuracy on X-ray images in security screening scenarios still requires improvement. This paper proposes a coupled multi-scale feature extraction and multi-scale attention architecture. We integrate this architecture into the Single Shot MultiBox Detector (SSD) algorithm and find that it significantly improves target detection performance. First, ResNet replaces the original VGG network as the backbone to improve the network's feature extraction capability. Second, a multi-scale feature extraction (MSE) structure is designed to enrich the information contained in the multi-stage prediction feature layers. Finally, a multi-scale attention (MSA) architecture is fused onto the prediction feature layers to eliminate interference from redundant features and extract effective contextual information. In addition, a combination of Adaptive-NMS and Soft-NMS is used during non-maximum suppression to output the final prediction boxes. Experimental results show that the improved method raises the mean average precision (mAP) by 7.4% compared to the original approach. The new modules make detection considerably more accurate while keeping the detection speed the same.
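
The summary mentions Soft-NMS as one part of the post-processing step. As a rough illustration only, the sketch below shows Gaussian Soft-NMS on its own in plain Python: rather than discarding boxes that overlap the current best detection, it decays their scores according to the overlap. It is not the authors' combined Adaptive-NMS and Soft-NMS procedure, which this record does not describe in detail. The function names, the `(x1, y1, x2, y2)` box layout, and the `sigma` and `score_threshold` defaults are assumptions made for this example.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS sketch (hypothetical helper, not the paper's code).

    Returns a list of (box_index, final_score) in the order boxes were selected.
    """
    candidates = list(enumerate(scores))
    kept = []
    while candidates:
        # Select the candidate with the highest current score.
        best_pos = max(range(len(candidates)), key=lambda k: candidates[k][1])
        best_idx, best_score = candidates.pop(best_pos)
        kept.append((best_idx, best_score))

        # Decay the scores of the remaining boxes by their overlap with it.
        rescored = []
        for idx, score in candidates:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_threshold:  # drop boxes whose score has collapsed
                rescored.append((idx, score))
        candidates = rescored
    return kept


# Two heavily overlapping boxes and one separate box (made-up values).
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
result = soft_nms(example_boxes, example_scores)
```

In this example the second box is kept with a reduced score instead of being removed, so it ranks below the non-overlapping third box. This behavior is what can help when objects overlap heavily, as they often do in X-ray baggage images.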