MonoDCN: Monocular 3D object detection based on dynamic convolution
Main authors: | Qu, Shenming; Yang, Xinyu; Gao, Yiming; Liang, Shengbin |
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2022 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9531824/ https://www.ncbi.nlm.nih.gov/pubmed/36194608 http://dx.doi.org/10.1371/journal.pone.0275438 |
_version_ | 1784801984070549504 |
author | Qu, Shenming Yang, Xinyu Gao, Yiming Liang, Shengbin |
author_facet | Qu, Shenming Yang, Xinyu Gao, Yiming Liang, Shengbin |
author_sort | Qu, Shenming |
collection | PubMed |
description | 3D object detection is vital to environment perception in autonomous driving. Current monocular 3D object detection methods mainly take RGB images or pseudo radar point clouds as input. Methods that take RGB images as input must learn with geometric constraints and ignore the depth information in the image, which makes them overly complicated and inefficient. Although some image-based methods use depth-map information for post-calibration and correction, such methods usually require a high-precision depth estimation network. Methods that take pseudo radar point clouds as input easily introduce noise when converting depth information into the pseudo radar point cloud, which causes large deviations during detection, and they ignore semantic information as well. We introduce dynamic convolution guided by the depth map into the feature extraction network; the convolution kernels of the dynamic convolution are learned automatically from the depth map of the image. This solves the problem that depth information and semantic information cannot be used simultaneously and improves the accuracy of monocular 3D object detection. MonoDCN significantly improves performance on both the monocular 3D object detection and Bird’s Eye View tasks of the KITTI urban autonomous driving dataset. |
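The abstract describes dynamic convolution whose kernels are conditioned on the image's depth map. The record does not give the paper's exact architecture, so the following is only a minimal NumPy sketch of the general idea behind dynamic convolution: a set of K basis kernels is aggregated with attention weights computed from a depth descriptor, and the aggregated kernel is then applied to the feature map. All names (`depth_guided_dynamic_conv`, the pooled depth descriptor, the linear attention head) are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def depth_guided_dynamic_conv(feat, depth, basis_kernels, attn_weights, attn_bias):
    """Aggregate K basis kernels with attention driven by depth statistics,
    then convolve the feature map with the aggregated kernel.

    feat          : (H, W) feature map
    depth         : (H, W) depth map guiding the kernel aggregation
    basis_kernels : (K, kh, kw) learnable basis kernels
    attn_weights  : (K, 2) linear head mapping the depth descriptor to K logits
    attn_bias     : (K,) bias of that head
    """
    K, kh, kw = basis_kernels.shape
    # Simple global depth descriptor (mean and spread of the depth map).
    pooled = np.array([depth.mean(), depth.std()])
    # Input-dependent attention over the K basis kernels; sums to 1.
    attn = softmax(attn_weights @ pooled + attn_bias)
    # Aggregated kernel: attention-weighted sum of the basis kernels.
    kernel = np.tensordot(attn, basis_kernels, axes=1)  # (kh, kw)
    # 'valid' 2-D cross-correlation of feat with the aggregated kernel.
    H, W = feat.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (feat[i:i + kh, j:j + kw] * kernel).sum()
    return out, attn
```

Because the attention depends on the depth map, two images with identical RGB features but different depth statistics are filtered with different effective kernels, which is the mechanism that lets depth and semantic information be used in one pass.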
format | Online Article Text |
id | pubmed-9531824 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9531824 2022-10-05 MonoDCN: Monocular 3D object detection based on dynamic convolution Qu, Shenming Yang, Xinyu Gao, Yiming Liang, Shengbin PLoS One Research Article 3D object detection is vital to environment perception in autonomous driving. Current monocular 3D object detection methods mainly take RGB images or pseudo radar point clouds as input. Methods that take RGB images as input must learn with geometric constraints and ignore the depth information in the image, which makes them overly complicated and inefficient. Although some image-based methods use depth-map information for post-calibration and correction, such methods usually require a high-precision depth estimation network. Methods that take pseudo radar point clouds as input easily introduce noise when converting depth information into the pseudo radar point cloud, which causes large deviations during detection, and they ignore semantic information as well. We introduce dynamic convolution guided by the depth map into the feature extraction network; the convolution kernels of the dynamic convolution are learned automatically from the depth map of the image. This solves the problem that depth information and semantic information cannot be used simultaneously and improves the accuracy of monocular 3D object detection. MonoDCN significantly improves performance on both the monocular 3D object detection and Bird’s Eye View tasks of the KITTI urban autonomous driving dataset.
Public Library of Science 2022-10-04 /pmc/articles/PMC9531824/ /pubmed/36194608 http://dx.doi.org/10.1371/journal.pone.0275438 Text en © 2022 Qu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Qu, Shenming Yang, Xinyu Gao, Yiming Liang, Shengbin MonoDCN: Monocular 3D object detection based on dynamic convolution |
title | MonoDCN: Monocular 3D object detection based on dynamic convolution |
title_full | MonoDCN: Monocular 3D object detection based on dynamic convolution |
title_fullStr | MonoDCN: Monocular 3D object detection based on dynamic convolution |
title_full_unstemmed | MonoDCN: Monocular 3D object detection based on dynamic convolution |
title_short | MonoDCN: Monocular 3D object detection based on dynamic convolution |
title_sort | monodcn: monocular 3d object detection based on dynamic convolution |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9531824/ https://www.ncbi.nlm.nih.gov/pubmed/36194608 http://dx.doi.org/10.1371/journal.pone.0275438 |
work_keys_str_mv | AT qushenming monodcnmonocular3dobjectdetectionbasedondynamicconvolution AT yangxinyu monodcnmonocular3dobjectdetectionbasedondynamicconvolution AT gaoyiming monodcnmonocular3dobjectdetectionbasedondynamicconvolution AT liangshengbin monodcnmonocular3dobjectdetectionbasedondynamicconvolution |