PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention
Designing efficient deep learning models for 3D point cloud perception is becoming a major research direction. Point-voxel convolution (PVConv; Liu et al., 2019) is a pioneering work on this topic. However, because it relies on quite a few layers of simple 3D convolutions and linear point-voxel featu...
Main authors: | Chen, Yuhong; Peng, Weilong; Tang, Keke; Khan, Asad; Wei, Guodong; Fang, Meie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9122688/ https://www.ncbi.nlm.nih.gov/pubmed/35602612 http://dx.doi.org/10.1155/2022/2286818 |
_version_ | 1784711397545869312 |
---|---|
author | Chen, Yuhong; Peng, Weilong; Tang, Keke; Khan, Asad; Wei, Guodong; Fang, Meie |
author_sort | Chen, Yuhong |
collection | PubMed |
description | Designing efficient deep learning models for 3D point cloud perception is becoming a major research direction. Point-voxel convolution (PVConv; Liu et al., 2019) is a pioneering work on this topic. However, because it relies on quite a few layers of simple 3D convolutions and linear point-voxel feature fusion operations, it still has considerable room for improvement in performance. In this paper, we propose a novel pyramid point-voxel convolution (PyraPVConv) block with two key structural modifications to address these issues. First, PyraPVConv uses a voxel pyramid module to extract voxel features in the manner of a feature pyramid, so that sufficient voxel features can be obtained efficiently. Second, a sharable attention module captures compatible features between the multi-scale voxels in the pyramid and the point cloud for aggregation, and reduces complexity via structure sharing. Extensive results on three point cloud perception tasks, i.e., indoor scene segmentation, object part segmentation, and 3D object detection, validate that networks constructed by stacking PyraPVConv blocks are efficient in terms of both GPU memory consumption and computational complexity, and are superior to state-of-the-art methods. |
format | Online Article Text |
id | pubmed-9122688 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-9122688 2022-05-21 PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention Chen, Yuhong; Peng, Weilong; Tang, Keke; Khan, Asad; Wei, Guodong; Fang, Meie Comput Intell Neurosci Research Article Designing efficient deep learning models for 3D point cloud perception is becoming a major research direction. Point-voxel convolution (PVConv; Liu et al., 2019) is a pioneering work on this topic. However, because it relies on quite a few layers of simple 3D convolutions and linear point-voxel feature fusion operations, it still has considerable room for improvement in performance. In this paper, we propose a novel pyramid point-voxel convolution (PyraPVConv) block with two key structural modifications to address these issues. First, PyraPVConv uses a voxel pyramid module to extract voxel features in the manner of a feature pyramid, so that sufficient voxel features can be obtained efficiently. Second, a sharable attention module captures compatible features between the multi-scale voxels in the pyramid and the point cloud for aggregation, and reduces complexity via structure sharing. Extensive results on three point cloud perception tasks, i.e., indoor scene segmentation, object part segmentation, and 3D object detection, validate that networks constructed by stacking PyraPVConv blocks are efficient in terms of both GPU memory consumption and computational complexity, and are superior to state-of-the-art methods. Hindawi 2022-05-13 /pmc/articles/PMC9122688/ /pubmed/35602612 http://dx.doi.org/10.1155/2022/2286818 Text en Copyright © 2022 Yuhong Chen et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9122688/ https://www.ncbi.nlm.nih.gov/pubmed/35602612 http://dx.doi.org/10.1155/2022/2286818 |
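
The description field in this record mentions extracting voxel features "in the manner of a feature pyramid" and then aggregating them with the point cloud. As a rough illustration only, the sketch below voxelizes one point cloud at several resolutions and reads the averaged voxel features back per point. It is not the authors' implementation: the function names (`voxelize`, `voxel_pyramid`, `devoxelize`), the default resolutions, the assumption that coordinates are normalized to [0, 1), and the nearest-voxel lookup are all choices made for this example, and the attention module is not modeled at all.

```python
from collections import defaultdict


def voxelize(points, feats, resolution):
    """Average point features inside each cell of a resolution^3 grid.

    Assumption for this sketch: coordinates are already normalized to [0, 1).
    """
    sums = defaultdict(lambda: [0.0] * len(feats[0]))
    counts = defaultdict(int)
    for point, feat in zip(points, feats):
        # Clamp so a coordinate of exactly 1.0 still lands in the last cell.
        key = tuple(min(int(c * resolution), resolution - 1) for c in point)
        counts[key] += 1
        for i, value in enumerate(feat):
            sums[key][i] += value
    return {k: [v / counts[k] for v in s] for k, s in sums.items()}


def voxel_pyramid(points, feats, resolutions=(8, 4, 2)):
    """Build one averaged voxel grid per resolution, finest first."""
    return [voxelize(points, feats, r) for r in resolutions]


def devoxelize(points, grid, resolution):
    """Nearest-voxel lookup: each point receives the feature of its own cell."""
    out = []
    for point in points:
        key = tuple(min(int(c * resolution), resolution - 1) for c in point)
        out.append(grid[key])
    return out


if __name__ == "__main__":
    pts = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (0.9, 0.9, 0.9)]
    fts = [[1.0], [3.0], [5.0]]
    fine, coarse = voxel_pyramid(pts, fts, resolutions=(4, 2))
    # At resolution 4 the three points fall in separate cells; at resolution 2
    # the first two share a cell, so their features are averaged.
    print(len(fine), len(coarse))
    print(devoxelize(pts, coarse, 2))
```

Coarser levels merge more points per cell, which is what lets a pyramid summarize context at several scales; a real model would replace the plain averaging and lookup with learned 3D convolutions and a learned fusion step.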