
GAC3D: improving monocular 3D object detection with ground-guide model and adaptive convolution

Monocular 3D object detection has recently become prevalent in autonomous driving and navigation applications because it is cost-efficient and easy to embed in existing vehicles. The most challenging task in monocular vision is to reliably estimate an object’s location because of the lack of depth inform...


Bibliographic Details
Main authors: Bui, Minh-Quan Viet, Ngo, Duc Tuan, Pham, Hoang-Anh, Nguyen, Duc Dung
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Artificial Intelligence
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8507478/
https://www.ncbi.nlm.nih.gov/pubmed/34712790
http://dx.doi.org/10.7717/peerj-cs.686
author Bui, Minh-Quan Viet
Ngo, Duc Tuan
Pham, Hoang-Anh
Nguyen, Duc Dung
collection PubMed
description Monocular 3D object detection has recently become prevalent in autonomous driving and navigation applications because it is cost-efficient and easy to embed in existing vehicles. The most challenging task in monocular vision is to reliably estimate an object’s location, because RGB images lack depth information. Many methods tackle this ill-posed problem by directly regressing the object’s depth or by taking a depth map as a supplementary input to enhance the model’s results. However, their performance relies heavily on the quality of the estimated depth map, which is biased toward the training data. In this work, we first propose depth-adaptive convolution to replace the traditional 2D convolution in order to deal with the divergent context of the image’s features. This leads to significant improvement in both training convergence and testing accuracy. Second, we propose a ground plane model that utilizes geometric constraints in the pose estimation process. With the new method, named GAC3D, we achieve better detection results. We demonstrate our approach on the KITTI 3D Object Detection benchmark, where it outperforms existing monocular methods.
format Online
Article
Text
id pubmed-8507478
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8507478 2021-10-27 GAC3D: improving monocular 3D object detection with ground-guide model and adaptive convolution Bui, Minh-Quan Viet Ngo, Duc Tuan Pham, Hoang-Anh Nguyen, Duc Dung PeerJ Comput Sci Artificial Intelligence Monocular 3D object detection has recently become prevalent in autonomous driving and navigation applications because it is cost-efficient and easy to embed in existing vehicles. The most challenging task in monocular vision is to reliably estimate an object’s location, because RGB images lack depth information. Many methods tackle this ill-posed problem by directly regressing the object’s depth or by taking a depth map as a supplementary input to enhance the model’s results. However, their performance relies heavily on the quality of the estimated depth map, which is biased toward the training data. In this work, we first propose depth-adaptive convolution to replace the traditional 2D convolution in order to deal with the divergent context of the image’s features. This leads to significant improvement in both training convergence and testing accuracy. Second, we propose a ground plane model that utilizes geometric constraints in the pose estimation process. With the new method, named GAC3D, we achieve better detection results. We demonstrate our approach on the KITTI 3D Object Detection benchmark, where it outperforms existing monocular methods. PeerJ Inc. 2021-10-06 /pmc/articles/PMC8507478/ /pubmed/34712790 http://dx.doi.org/10.7717/peerj-cs.686 Text en © 2021 Bui et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title GAC3D: improving monocular 3D object detection with ground-guide model and adaptive convolution
topic Artificial Intelligence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8507478/
https://www.ncbi.nlm.nih.gov/pubmed/34712790
http://dx.doi.org/10.7717/peerj-cs.686