eGAC3D: enhancing depth adaptive convolution and depth estimation for monocular 3D object pose detection

Many approaches to 3D object detection using a single camera have been studied as alternatives to high-precision 3D LiDAR sensors, which incur a prohibitive cost. Recently, we proposed a novel approach for 3D object detection by employing a ground plane model that utilizes geometric c...

Bibliographic Details
Main Authors:	Ngo, Duc Tuan; Bui, Minh-Quan Viet; Nguyen, Duc Dung; Pham, Hoang-Anh
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2022
Subjects:	Artificial Intelligence
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9680869/ https://www.ncbi.nlm.nih.gov/pubmed/36426241 http://dx.doi.org/10.7717/peerj-cs.1144

_version_	1784834499570302976
author	Ngo, Duc Tuan Bui, Minh-Quan Viet Nguyen, Duc Dung Pham, Hoang-Anh
author_facet	Ngo, Duc Tuan Bui, Minh-Quan Viet Nguyen, Duc Dung Pham, Hoang-Anh
author_sort	Ngo, Duc Tuan
collection	PubMed
description	Many approaches to 3D object detection using a single camera have been studied as alternatives to high-precision 3D LiDAR sensors, which incur a prohibitive cost. Recently, we proposed GAC3D, a novel approach for 3D object detection that employs a ground plane model with geometric constraints to improve the results of the deep learning-based detector. GAC3D replaces the traditional 2D convolution with a depth adaptive convolution to deal with the divergent context of the image’s features, leading to a significant improvement in both training convergence and testing accuracy on the KITTI 3D object detection benchmark. This article presents an alternative architecture named eGAC3D that adopts a revised depth adaptive convolution with variant guidance to improve detection accuracy. Additionally, eGAC3D uses the pixel adaptive convolution to leverage the depth map as guidance for the detection heads, instead of relying on an external depth estimator as other methods do, leading to a significant reduction in inference time. The experimental results on the KITTI benchmark show that our eGAC3D outperforms not only our previous GAC3D but also many existing monocular methods in terms of accuracy and inference time. Moreover, we deployed and optimized the proposed eGAC3D framework on an embedded platform with a low-cost GPU. To the best of the authors’ knowledge, we are the first to develop a monocular 3D detection framework on embedded devices. The experimental results on Jetson Xavier NX demonstrate that our proposed method can achieve nearly real-time performance with appropriate accuracy even with modest hardware resources.
format	Online Article Text
id	pubmed-9680869
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-9680869 2022-11-23 PeerJ Comput Sci Artificial Intelligence PeerJ Inc. 2022-11-03 /pmc/articles/PMC9680869/ /pubmed/36426241 http://dx.doi.org/10.7717/peerj-cs.1144 Text en © 2022 Ngo et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title	eGAC3D: enhancing depth adaptive convolution and depth estimation for monocular 3D object pose detection
title_sort	egac3d: enhancing depth adaptive convolution and depth estimation for monocular 3d object pose detection
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9680869/ https://www.ncbi.nlm.nih.gov/pubmed/36426241 http://dx.doi.org/10.7717/peerj-cs.1144
