Attention Fusion for One-Stage Multispectral Pedestrian Detection

Multispectral pedestrian detection, which consists of a color stream and a thermal stream, is essential under conditions of insufficient illumination because the fusion of the two streams can provide complementary information for detecting pedestrians based on deep convolutional neural networks (CNNs)...

Bibliographic Details
Main Authors: Cao, Zhiwei; Yang, Huihua; Zhao, Juan; Guo, Shuhong; Li, Lingqiao
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8235776/
https://www.ncbi.nlm.nih.gov/pubmed/34207183
http://dx.doi.org/10.3390/s21124184
_version_ 1783714397881892864
author Cao, Zhiwei
Yang, Huihua
Zhao, Juan
Guo, Shuhong
Li, Lingqiao
author_facet Cao, Zhiwei
Yang, Huihua
Zhao, Juan
Guo, Shuhong
Li, Lingqiao
author_sort Cao, Zhiwei
collection PubMed
description Multispectral pedestrian detection, which consists of a color stream and a thermal stream, is essential under conditions of insufficient illumination because the fusion of the two streams can provide complementary information for detecting pedestrians based on deep convolutional neural networks (CNNs). In this paper, we introduced and adapted a simple and efficient one-stage YOLOv4 to replace the current state-of-the-art two-stage fast-RCNN for multispectral pedestrian detection and to directly predict bounding boxes with confidence scores. To further improve the detection performance, we analyzed the existing multispectral fusion methods and proposed a novel multispectral channel feature fusion (MCFF) module for integrating the features from the color and thermal streams according to the illumination conditions. Moreover, several fusion architectures, such as Early Fusion, Halfway Fusion, Late Fusion, and Direct Fusion, were carefully designed based on the MCFF to transfer the feature information from the bottom to the top at different stages. Finally, the experimental results on the KAIST and Utokyo pedestrian benchmarks showed that Halfway Fusion achieved the best performance of all the architectures and that the MCFF could adapt the fused features from the two modalities. The log-average miss rates (MR) for the two modalities with reasonable settings were 4.91% and 23.14%, respectively.
format Online
Article
Text
id pubmed-8235776
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8235776 2021-06-27 Attention Fusion for One-Stage Multispectral Pedestrian Detection Cao, Zhiwei Yang, Huihua Zhao, Juan Guo, Shuhong Li, Lingqiao Sensors (Basel) Article MDPI 2021-06-18 /pmc/articles/PMC8235776/ /pubmed/34207183 http://dx.doi.org/10.3390/s21124184 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Cao, Zhiwei
Yang, Huihua
Zhao, Juan
Guo, Shuhong
Li, Lingqiao
Attention Fusion for One-Stage Multispectral Pedestrian Detection
title Attention Fusion for One-Stage Multispectral Pedestrian Detection
title_full Attention Fusion for One-Stage Multispectral Pedestrian Detection
title_fullStr Attention Fusion for One-Stage Multispectral Pedestrian Detection
title_full_unstemmed Attention Fusion for One-Stage Multispectral Pedestrian Detection
title_short Attention Fusion for One-Stage Multispectral Pedestrian Detection
title_sort attention fusion for one-stage multispectral pedestrian detection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8235776/
https://www.ncbi.nlm.nih.gov/pubmed/34207183
http://dx.doi.org/10.3390/s21124184
work_keys_str_mv AT caozhiwei attentionfusionforonestagemultispectralpedestriandetection
AT yanghuihua attentionfusionforonestagemultispectralpedestriandetection
AT zhaojuan attentionfusionforonestagemultispectralpedestriandetection
AT guoshuhong attentionfusionforonestagemultispectralpedestriandetection
AT lilingqiao attentionfusionforonestagemultispectralpedestriandetection