Vehicle Detection in Urban Traffic Surveillance Images Based on Convolutional Neural Networks with Feature Concatenation
Vehicle detection with category inference on video sequence data is an important but challenging task for urban traffic surveillance. The difficulty of this task lies in the fact that it requires accurate localization of relatively small vehicles in complex scenes and expects real-time detection. In...
Main authors: | Zhang, Fukai; Li, Ce; Yang, Feng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6387095/ https://www.ncbi.nlm.nih.gov/pubmed/30704152 http://dx.doi.org/10.3390/s19030594 |
_version_ | 1783397494465495040 |
---|---|
author | Zhang, Fukai; Li, Ce; Yang, Feng |
author_sort | Zhang, Fukai |
collection | PubMed |
description | Vehicle detection with category inference on video sequence data is an important but challenging task for urban traffic surveillance. The task is difficult because it requires accurate localization of relatively small vehicles in complex scenes and must run in real time. In this paper, we present a vehicle detection framework that improves on the conventional Single Shot MultiBox Detector (SSD) and effectively detects different types of vehicles in real time. Our approach, denoted DP-SSD, uses different feature extractors for the localization and classification tasks in a single network and enhances both extractors through deconvolution (D) and pooling (P) between layers in the feature pyramid. In addition, we extend the scope of the default box by adjusting its scale so that smaller default boxes can be exploited to guide DP-SSD training. Experimental results on the UA-DETRAC and KITTI datasets demonstrate that DP-SSD can achieve efficient vehicle detection for real-world traffic surveillance data in real time. On the UA-DETRAC test set, with training on the UA-DETRAC trainval set, DP-SSD with an input size of 300 × 300 achieves 75.43% mAP (mean average precision) at 50.47 FPS (frames per second), and the framework with a 512 × 512 input reaches 77.94% mAP at 25.12 FPS on an NVIDIA GeForce GTX 1080Ti GPU. DP-SSD shows competitive accuracy, better than that of the compared state-of-the-art models except YOLOv3. |
format | Online Article Text |
id | pubmed-6387095 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6387095 2019-02-26 Vehicle Detection in Urban Traffic Surveillance Images Based on Convolutional Neural Networks with Feature Concatenation Zhang, Fukai Li, Ce Yang, Feng Sensors (Basel) Article MDPI 2019-01-30 /pmc/articles/PMC6387095/ /pubmed/30704152 http://dx.doi.org/10.3390/s19030594 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | Vehicle Detection in Urban Traffic Surveillance Images Based on Convolutional Neural Networks with Feature Concatenation |
topic | Article |
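
The description above mentions adjusting the default box scale so that smaller default boxes are available during training. As a rough illustration only, the sketch below uses the linear scale rule from the original SSD design (scales spaced evenly between a minimum and a maximum across the feature layers). The record does not state the values DP-SSD uses, so the lowered `s_min=0.1`, the six-layer count, and the function names are assumptions made for this example.

```python
def default_box_scales(num_layers, s_min=0.2, s_max=0.9):
    """Return one scale per feature layer, evenly spaced from s_min to s_max.

    This follows the linear rule of the original SSD design; 0.2 and 0.9
    are the defaults reported for SSD, not values taken from DP-SSD.
    """
    if num_layers < 2:
        raise ValueError("need at least two feature layers")
    step = (s_max - s_min) / (num_layers - 1)
    return [s_min + step * k for k in range(num_layers)]


def box_sizes_px(scales, input_size):
    """Convert relative scales to box side lengths in pixels."""
    return [s * input_size for s in scales]


# Baseline SSD-style scales for six feature layers.
baseline = default_box_scales(6)

# Hypothetical smaller minimum scale (assumption, not from the record):
# lowering s_min shifts the smallest default boxes toward smaller objects.
smaller = default_box_scales(6, s_min=0.1)

for name, scales in (("baseline", baseline), ("smaller s_min", smaller)):
    sizes = ", ".join(f"{px:.0f}" for px in box_sizes_px(scales, 300))
    print(f"{name}: {sizes} px at 300 x 300 input")
```

Lowering the minimum scale changes only the spacing of the boxes across layers; how DP-SSD actually assigns scales per layer should be checked against the article itself.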