Voxel Transformer with Density-Aware Deformable Attention for 3D Object Detection
Main Authors: | Kim, Taeho; Kim, Joohee |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10458770/ https://www.ncbi.nlm.nih.gov/pubmed/37631754 http://dx.doi.org/10.3390/s23167217 |
_version_ | 1785097245794762752 |
---|---|
author | Kim, Taeho; Kim, Joohee |
collection | PubMed |
description | The Voxel Transformer (VoTr) is a prominent model in the field of 3D object detection, employing a transformer-based architecture to comprehend long-range voxel relationships through self-attention. However, despite its expanded receptive field, VoTr’s flexibility is constrained by its predefined receptive field. In this paper, we present a Voxel Transformer with Density-Aware Deformable Attention (VoTr-DADA), a novel approach to 3D object detection. VoTr-DADA leverages density-guided deformable attention for a more adaptable receptive field. It efficiently identifies key areas in the input using density features, combining the strengths of both VoTr and Deformable Attention. We introduce the Density-Aware Deformable Attention (DADA) module, which is specifically designed to focus on these crucial areas while adaptively extracting more informative features. Experimental results on the KITTI dataset and the Waymo Open dataset show that our proposed method outperforms the baseline VoTr model in 3D object detection while maintaining a fast inference speed. |
format | Online Article Text |
id | pubmed-10458770 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10458770 2023-08-27 Voxel Transformer with Density-Aware Deformable Attention for 3D Object Detection Kim, Taeho Kim, Joohee Sensors (Basel) Article MDPI 2023-08-17 /pmc/articles/PMC10458770/ /pubmed/37631754 http://dx.doi.org/10.3390/s23167217 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Voxel Transformer with Density-Aware Deformable Attention for 3D Object Detection |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10458770/ https://www.ncbi.nlm.nih.gov/pubmed/37631754 http://dx.doi.org/10.3390/s23167217 |