
Deep Learning-Based Monocular 3D Object Detection with Refinement of Depth Information

Recently, the research on monocular 3D target detection based on pseudo-LiDAR data has made some progress. In contrast to LiDAR-based algorithms, the robustness of pseudo-LiDAR methods is still inferior. After conducting in-depth experiments, we realized that the main limitations are due to the inac...

Bibliographic Details
Main authors:	Hu, Henan; Zhu, Ming; Li, Muyu; Chan, Kwok-Leung
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9003335/ https://www.ncbi.nlm.nih.gov/pubmed/35408191 http://dx.doi.org/10.3390/s22072576

_version_	1784686109469442048
author	Hu, Henan; Zhu, Ming; Li, Muyu; Chan, Kwok-Leung
author_facet	Hu, Henan; Zhu, Ming; Li, Muyu; Chan, Kwok-Leung
author_sort	Hu, Henan
collection	PubMed
description	Recently, the research on monocular 3D target detection based on pseudo-LiDAR data has made some progress. In contrast to LiDAR-based algorithms, the robustness of pseudo-LiDAR methods is still inferior. After conducting in-depth experiments, we realized that the main limitations are due to the inaccuracy of the target position and the uncertainty in the depth distribution of the foreground target. These two problems arise from the inaccurate depth estimation. To deal with the aforementioned problems, we propose two innovative solutions. The first is a novel method based on joint image segmentation and geometric constraints, used to predict the target depth and provide the depth prediction confidence measure. The predicted target depth is fused with the overall depth of the scene and results in the optimal target position. For the second, we utilize the target scale, normalized with the Gaussian function, as a priori information. The uncertainty of depth distribution, which can be visualized as long-tail noise, is reduced. With the refined depth information, we convert the optimized depth map into the point cloud representation, called a pseudo-LiDAR point cloud. Finally, we input the pseudo-LiDAR point cloud to the LiDAR-based algorithm to detect the 3D target. We conducted extensive experiments on the challenging KITTI dataset. The results demonstrate that our proposed framework outperforms various state-of-the-art methods by more than 12.37% and 5.34% on the easy and hard settings of the KITTI validation subset, respectively. On the KITTI test set, our framework also outperformed state-of-the-art methods by 5.1% and 1.76% on the easy and hard settings, respectively.
format	Online Article Text
id	pubmed-9003335
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9003335 2022-04-13 Deep Learning-Based Monocular 3D Object Detection with Refinement of Depth Information Hu, Henan; Zhu, Ming; Li, Muyu; Chan, Kwok-Leung Sensors (Basel) Article MDPI 2022-03-28 /pmc/articles/PMC9003335/ /pubmed/35408191 http://dx.doi.org/10.3390/s22072576 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Hu, Henan Zhu, Ming Li, Muyu Chan, Kwok-Leung Deep Learning-Based Monocular 3D Object Detection with Refinement of Depth Information
title	Deep Learning-Based Monocular 3D Object Detection with Refinement of Depth Information
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9003335/ https://www.ncbi.nlm.nih.gov/pubmed/35408191 http://dx.doi.org/10.3390/s22072576
work_keys_str_mv	AT huhenan deeplearningbasedmonocular3dobjectdetectionwithrefinementofdepthinformation AT zhuming deeplearningbasedmonocular3dobjectdetectionwithrefinementofdepthinformation AT limuyu deeplearningbasedmonocular3dobjectdetectionwithrefinementofdepthinformation AT chankwokleung deeplearningbasedmonocular3dobjectdetectionwithrefinementofdepthinformation
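
Note on the abstract: the description field above states that the refined depth map is converted into a pseudo-LiDAR point cloud before being passed to a LiDAR-based detector. This record gives no formula for that step, so the following is only a minimal sketch of the standard pinhole back-projection commonly used for this kind of conversion, not the authors' implementation. The function name, the intrinsics `fx`, `fy`, `cx`, `cy`, and the `max_depth` cutoff are all illustrative assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_map_to_points(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # hypothetical cutoff, not taken from the record
) -> List[Point]:
    """Back-project a dense depth map into 3D points in the camera frame.

    depth[v][u] is the depth in metres at pixel column u, row v.
    fx, fy, cx, cy are pinhole intrinsics (assumed known from calibration).
    Pixels with non-positive depth or depth beyond max_depth are skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0 or z > max_depth:
                continue
            x = (u - cx) * z / fx  # offset to the right of the optical axis
            y = (v - cy) * z / fy  # offset below the optical axis
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Toy 2x3 depth map; the zero entry stands for a pixel with no valid depth.
    toy_depth = [
        [10.0, 10.0, 0.0],
        [20.0, 20.0, 20.0],
    ]
    cloud = depth_map_to_points(toy_depth, fx=2.0, fy=2.0, cx=1.0, cy=0.5)
    for p in cloud:
        print(p)
```

The resulting points are in the camera coordinate frame; a real pipeline would typically also transform them into the LiDAR frame using the dataset's calibration before feeding them to a detector.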
