eGAC3D: enhancing depth adaptive convolution and depth estimation for monocular 3D object pose detection

Bibliographic Details
Main authors: Ngo, Duc Tuan; Bui, Minh-Quan Viet; Nguyen, Duc Dung; Pham, Hoang-Anh
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Artificial Intelligence
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9680869/
https://www.ncbi.nlm.nih.gov/pubmed/36426241
http://dx.doi.org/10.7717/peerj-cs.1144
_version_ 1784834499570302976
author Ngo, Duc Tuan
Bui, Minh-Quan Viet
Nguyen, Duc Dung
Pham, Hoang-Anh
collection PubMed
description Because high-precision 3D LiDAR sensors are prohibitively expensive, many alternative approaches to 3D object detection using a single camera have been studied. Recently, we proposed GAC3D, a novel approach for 3D object detection that employs a ground plane model with geometric constraints to improve the results of the deep learning-based detector. GAC3D replaces the traditional 2D convolution with a depth adaptive convolution to deal with the divergent context of the image’s features, leading to a significant improvement in both training convergence and testing accuracy on the KITTI 3D object detection benchmark. This article presents an alternative architecture named eGAC3D that adopts a revised depth adaptive convolution with variant guidance to improve detection accuracy. Additionally, eGAC3D uses pixel adaptive convolution to let the depth map guide the detection heads instead of relying on an external depth estimator as other methods do, which significantly reduces inference time. The experimental results on the KITTI benchmark show that eGAC3D outperforms not only our previous GAC3D but also many existing monocular methods in terms of accuracy and inference time. Moreover, we deployed and optimized the proposed eGAC3D framework on an embedded platform with a low-cost GPU. To the best of the authors’ knowledge, we are the first to develop a monocular 3D detection framework on embedded devices. The experimental results on Jetson Xavier NX demonstrate that the proposed method can achieve nearly real-time performance with appropriate accuracy even on modest hardware resources.
format Online
Article
Text
id pubmed-9680869
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9680869 2022-11-23 eGAC3D: enhancing depth adaptive convolution and depth estimation for monocular 3D object pose detection Ngo, Duc Tuan Bui, Minh-Quan Viet Nguyen, Duc Dung Pham, Hoang-Anh PeerJ Comput Sci Artificial Intelligence PeerJ Inc.
2022-11-03 /pmc/articles/PMC9680869/ /pubmed/36426241 http://dx.doi.org/10.7717/peerj-cs.1144 Text en © 2022 Ngo et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title eGAC3D: enhancing depth adaptive convolution and depth estimation for monocular 3D object pose detection
topic Artificial Intelligence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9680869/
https://www.ncbi.nlm.nih.gov/pubmed/36426241
http://dx.doi.org/10.7717/peerj-cs.1144