Voxel Transformer with Density-Aware Deformable Attention for 3D Object Detection

The Voxel Transformer (VoTr) is a prominent model in the field of 3D object detection, employing a transformer-based architecture to comprehend long-range voxel relationships through self-attention. However, despite its expanded receptive field, VoTr’s flexibility is constrained by its predefined re...

Bibliographic Details
Main Authors: Kim, Taeho, Kim, Joohee
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10458770/
https://www.ncbi.nlm.nih.gov/pubmed/37631754
http://dx.doi.org/10.3390/s23167217
author Kim, Taeho
Kim, Joohee
collection PubMed
description The Voxel Transformer (VoTr) is a prominent model in the field of 3D object detection, employing a transformer-based architecture to comprehend long-range voxel relationships through self-attention. However, despite its expanded receptive field, VoTr’s flexibility is constrained by its predefined receptive field. In this paper, we present a Voxel Transformer with Density-Aware Deformable Attention (VoTr-DADA), a novel approach to 3D object detection. VoTr-DADA leverages density-guided deformable attention for a more adaptable receptive field. It efficiently identifies key areas in the input using density features, combining the strengths of both VoTr and Deformable Attention. We introduce the Density-Aware Deformable Attention (DADA) module, which is specifically designed to focus on these crucial areas while adaptively extracting more informative features. Experimental results on the KITTI dataset and the Waymo Open dataset show that our proposed method outperforms the baseline VoTr model in 3D object detection while maintaining a fast inference speed.
format Online
Article
Text
id pubmed-10458770
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10458770 2023-08-27 Voxel Transformer with Density-Aware Deformable Attention for 3D Object Detection Kim, Taeho Kim, Joohee Sensors (Basel) Article MDPI 2023-08-17 /pmc/articles/PMC10458770/ /pubmed/37631754 http://dx.doi.org/10.3390/s23167217 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Voxel Transformer with Density-Aware Deformable Attention for 3D Object Detection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10458770/
https://www.ncbi.nlm.nih.gov/pubmed/37631754
http://dx.doi.org/10.3390/s23167217