PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention

Designing efficient deep learning models for 3D point cloud perception is becoming a major research direction. Point-voxel convolution (PVConv) Liu et al. (2019) is a pioneering research work in this topic. However, since with quite a few layers of simple 3D convolutions and linear point-voxel featu...

Bibliographic Details
Main authors: Chen, Yuhong, Peng, Weilong, Tang, Keke, Khan, Asad, Wei, Guodong, Fang, Meie
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9122688/
https://www.ncbi.nlm.nih.gov/pubmed/35602612
http://dx.doi.org/10.1155/2022/2286818
_version_ 1784711397545869312
author Chen, Yuhong
Peng, Weilong
Tang, Keke
Khan, Asad
Wei, Guodong
Fang, Meie
author_facet Chen, Yuhong
Peng, Weilong
Tang, Keke
Khan, Asad
Wei, Guodong
Fang, Meie
author_sort Chen, Yuhong
collection PubMed
description Designing efficient deep learning models for 3D point cloud perception is becoming a major research direction. Point-voxel convolution (PVConv; Liu et al., 2019) is a pioneering work on this topic. However, because it relies on quite a few layers of simple 3D convolutions and linear point-voxel feature fusion operations, it still has considerable room for improvement in performance. In this paper, we propose a novel pyramid point-voxel convolution (PyraPVConv) block with two key structural modifications to address these issues. First, PyraPVConv uses a voxel pyramid module to extract voxel features in the manner of a feature pyramid, so that sufficient voxel features can be obtained efficiently. Second, a sharable attention module captures compatible features between the multi-scale voxels in the pyramid and the point cloud for aggregation, and reduces complexity via structure sharing. Extensive results on three point cloud perception tasks (indoor scene segmentation, object part segmentation, and 3D object detection) validate that networks constructed by stacking PyraPVConv blocks are efficient in terms of both GPU memory consumption and computational complexity, and are superior to state-of-the-art methods.
format Online
Article
Text
id pubmed-9122688
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9122688
2022-05-21
PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention
Chen, Yuhong
Peng, Weilong
Tang, Keke
Khan, Asad
Wei, Guodong
Fang, Meie
Comput Intell Neurosci
Research Article
Designing efficient deep learning models for 3D point cloud perception is becoming a major research direction. Point-voxel convolution (PVConv; Liu et al., 2019) is a pioneering work on this topic. However, because it relies on quite a few layers of simple 3D convolutions and linear point-voxel feature fusion operations, it still has considerable room for improvement in performance. In this paper, we propose a novel pyramid point-voxel convolution (PyraPVConv) block with two key structural modifications to address these issues. First, PyraPVConv uses a voxel pyramid module to extract voxel features in the manner of a feature pyramid, so that sufficient voxel features can be obtained efficiently. Second, a sharable attention module captures compatible features between the multi-scale voxels in the pyramid and the point cloud for aggregation, and reduces complexity via structure sharing. Extensive results on three point cloud perception tasks (indoor scene segmentation, object part segmentation, and 3D object detection) validate that networks constructed by stacking PyraPVConv blocks are efficient in terms of both GPU memory consumption and computational complexity, and are superior to state-of-the-art methods.
Hindawi
2022-05-13
/pmc/articles/PMC9122688/
/pubmed/35602612
http://dx.doi.org/10.1155/2022/2286818
Text
en
Copyright © 2022 Yuhong Chen et al.
https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Chen, Yuhong
Peng, Weilong
Tang, Keke
Khan, Asad
Wei, Guodong
Fang, Meie
PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention
title PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention
title_full PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention
title_fullStr PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention
title_full_unstemmed PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention
title_short PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention
title_sort pyrapvconv: efficient 3d point cloud perception with pyramid voxel convolution and sharable attention
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9122688/
https://www.ncbi.nlm.nih.gov/pubmed/35602612
http://dx.doi.org/10.1155/2022/2286818
work_keys_str_mv AT chenyuhong pyrapvconvefficient3dpointcloudperceptionwithpyramidvoxelconvolutionandsharableattention
AT pengweilong pyrapvconvefficient3dpointcloudperceptionwithpyramidvoxelconvolutionandsharableattention
AT tangkeke pyrapvconvefficient3dpointcloudperceptionwithpyramidvoxelconvolutionandsharableattention
AT khanasad pyrapvconvefficient3dpointcloudperceptionwithpyramidvoxelconvolutionandsharableattention
AT weiguodong pyrapvconvefficient3dpointcloudperceptionwithpyramidvoxelconvolutionandsharableattention
AT fangmeie pyrapvconvefficient3dpointcloudperceptionwithpyramidvoxelconvolutionandsharableattention