FCNet: Stereo 3D Object Detection with Feature Correlation Networks

Deep-learning techniques have significantly improved object detection performance, especially with binocular images in 3D scenarios. To supervise the depth information in stereo 3D object detection, reconstructing the 3D dense depth of LiDAR point clouds causes higher computational costs and lower i...

Bibliographic Details
Main authors: Wu, Yingyu; Liu, Ziyan; Chen, Yunlei; Zheng, Xuhui; Zhang, Qian; Yang, Mo; Tang, Guangming
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9407267/
https://www.ncbi.nlm.nih.gov/pubmed/36010784
http://dx.doi.org/10.3390/e24081121
author Wu, Yingyu
Liu, Ziyan
Chen, Yunlei
Zheng, Xuhui
Zhang, Qian
Yang, Mo
Tang, Guangming
collection PubMed
description Deep-learning techniques have significantly improved object detection performance, especially with binocular images in 3D scenarios. To supervise the depth information in stereo 3D object detection, reconstructing the 3D dense depth of LiDAR point clouds causes higher computational costs and lower inference speed. After exploring the intrinsic relationship between the implicit depth information and semantic texture features of the binocular images, we propose FCNet, an efficient and accurate 3D object detection algorithm for stereo images. First, we generate multi-scale feature maps from the input stereo images and use the normalized dot-product to construct a multi-scale cost volume containing implicit depth information. Second, the variant attention model enhances its global and local description, and the sparse region monitors the depth loss deep regression. Third, to balance the channel information preservation of the re-fused left–right feature maps against the computational burden, a reweighting strategy is employed to enhance the feature correlation when merging the last-layer features of the binocular images. Extensive experimental results on the challenging KITTI benchmark demonstrate that the proposed algorithm achieves better performance, including a lower computational cost and higher inference speed in 3D object detection.
format Online
Article
Text
id pubmed-9407267
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9407267 2022-08-26 FCNet: Stereo 3D Object Detection with Feature Correlation Networks Wu, Yingyu; Liu, Ziyan; Chen, Yunlei; Zheng, Xuhui; Zhang, Qian; Yang, Mo; Tang, Guangming Entropy (Basel) Article MDPI 2022-08-14 /pmc/articles/PMC9407267/ /pubmed/36010784 http://dx.doi.org/10.3390/e24081121 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title FCNet: Stereo 3D Object Detection with Feature Correlation Networks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9407267/
https://www.ncbi.nlm.nih.gov/pubmed/36010784
http://dx.doi.org/10.3390/e24081121
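
Illustrative note: the description in this record mentions building a cost volume from left and right feature maps with a normalized dot-product. The sketch below shows one common way such a correlation volume can be computed at a single scale; it is not the authors' implementation. The function names, the nested-list [C][H][W] layout, and the max_disp parameter are assumptions made for this example, and a multi-scale version would presumably repeat the same step on each feature-map resolution.

```python
import math

def normalize_channels(feat):
    """L2-normalize the channel vector at every pixel of a [C][H][W] nested list."""
    channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for y in range(height):
        for x in range(width):
            norm = math.sqrt(sum(feat[c][y][x] ** 2 for c in range(channels)))
            if norm > 0.0:  # all-zero feature vectors stay at zero
                for c in range(channels):
                    out[c][y][x] = feat[c][y][x] / norm
    return out

def correlation_cost_volume(left, right, max_disp):
    """Return a [max_disp][H][W] volume of normalized dot-products.

    Entry [d][y][x] compares the left feature at (y, x) with the right
    feature at (y, x - d). Positions where x - d < 0 are left at 0.
    (Hypothetical helper, assumed layout: channels first.)
    """
    left_n, right_n = normalize_channels(left), normalize_channels(right)
    channels, height, width = len(left), len(left[0]), len(left[0][0])
    volume = [[[0.0] * width for _ in range(height)] for _ in range(max_disp)]
    for d in range(max_disp):
        for y in range(height):
            for x in range(d, width):
                volume[d][y][x] = sum(
                    left_n[c][y][x] * right_n[c][y][x - d] for c in range(channels)
                )
    return volume

# Toy 2-channel, 1x4 feature maps: the left map is the right map shifted by one pixel.
right = [[[1.0, 0.0, 1.0, 2.0]], [[0.0, 1.0, 1.0, 1.0]]]
left = [[[0.0, 1.0, 0.0, 1.0]], [[0.0, 0.0, 1.0, 1.0]]]
volume = correlation_cost_volume(left, right, max_disp=3)

# Disparity with the highest correlation at the last pixel of the only row.
best = max(range(3), key=lambda d: volume[d][0][3])
```

In this toy input the correlation at the last pixel peaks at disparity 1, which matches the one-pixel shift between the two maps. Because each score is a cosine similarity rather than a raw dot-product, it stays in [-1, 1] regardless of feature magnitude, which is one plausible reason to prefer the normalized form.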