Efficient Layer-Wise N:M Sparse CNN Accelerator with Flexible SPEC: Sparse Processing Element Clusters

Recently, the layer-wise N:M fine-grained sparse neural network algorithm (i.e., every M weights contain N non-zero values) has attracted tremendous attention, as it can effectively reduce computational complexity with negligible accuracy loss. However, the speed-up potential of this algorithm will not be fully exploited if the right hardware support is lacking. In this work, we design an efficient accelerator for N:M sparse convolutional neural networks (CNNs) with layer-wise sparse patterns. First, we analyze the performance of different processing element (PE) structures and extensions to construct a flexible PE architecture. Second, the hardware design accommodates variable sparse convolution dimensions and sparsity ratios. With the sparse PE cluster (SPEC) design, the hardware can efficiently accelerate CNNs with the layer-wise N:M pattern. Finally, we integrate the proposed SPEC into a CNN accelerator with a flexible network-on-chip and a specially designed dataflow. We implement hardware accelerators on Xilinx ZCU102 and VCU118 FPGAs and evaluate them with classical CNNs such as AlexNet, VGG-16, and ResNet-50. Compared with existing accelerators designed for structured and unstructured pruned networks, our design achieves the best power efficiency.


Bibliographic Details
Main Authors: Xie, Xiaoru, Zhu, Mingyu, Lu, Siyuan, Wang, Zhongfeng
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10057003/
https://www.ncbi.nlm.nih.gov/pubmed/36984936
http://dx.doi.org/10.3390/mi14030528
_version_ 1785016258848096256
author Xie, Xiaoru
Zhu, Mingyu
Lu, Siyuan
Wang, Zhongfeng
author_facet Xie, Xiaoru
Zhu, Mingyu
Lu, Siyuan
Wang, Zhongfeng
author_sort Xie, Xiaoru
collection PubMed
description Recently, the layer-wise N:M fine-grained sparse neural network algorithm (i.e., every M weights contain N non-zero values) has attracted tremendous attention, as it can effectively reduce computational complexity with negligible accuracy loss. However, the speed-up potential of this algorithm will not be fully exploited if the right hardware support is lacking. In this work, we design an efficient accelerator for N:M sparse convolutional neural networks (CNNs) with layer-wise sparse patterns. First, we analyze the performance of different processing element (PE) structures and extensions to construct a flexible PE architecture. Second, the hardware design accommodates variable sparse convolution dimensions and sparsity ratios. With the sparse PE cluster (SPEC) design, the hardware can efficiently accelerate CNNs with the layer-wise N:M pattern. Finally, we integrate the proposed SPEC into a CNN accelerator with a flexible network-on-chip and a specially designed dataflow. We implement hardware accelerators on Xilinx ZCU102 and VCU118 FPGAs and evaluate them with classical CNNs such as AlexNet, VGG-16, and ResNet-50. Compared with existing accelerators designed for structured and unstructured pruned networks, our design achieves the best power efficiency.
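The layer-wise N:M pattern in the abstract can be illustrated in a few lines: for each group of M consecutive weights, at most N values are kept non-zero. A minimal sketch, assuming magnitude-based selection within each group (the selection criterion and the function name `nm_prune` are illustrative, not the paper's actual pruning procedure):

```python
def nm_prune(weights, n=2, m=4):
    """Layer-wise N:M sparsity sketch: in each contiguous group of m
    weights, keep the n largest-magnitude values and zero the rest."""
    assert len(weights) % m == 0, "weight count must be a multiple of m"
    out = [0.0] * len(weights)
    for i in range(0, len(weights), m):
        group = weights[i:i + m]
        # indices of the n largest-magnitude weights in this group
        top = sorted(range(m), key=lambda j: abs(group[j]), reverse=True)[:n]
        for j in top:
            out[i + j] = group[j]
    return out

w = [0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.01, 0.6]
print(nm_prune(w, n=2, m=4))
# → [0.9, 0.0, 0.4, 0.0, -0.7, 0.0, 0.0, 0.6]
```

Each group of four weights retains only its two largest-magnitude entries (2:4 sparsity), which is what gives a hardware accelerator a fixed, predictable non-zero count per group to exploit.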
format Online
Article
Text
id pubmed-10057003
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-100570032023-03-30 Efficient Layer-Wise N:M Sparse CNN Accelerator with Flexible SPEC: Sparse Processing Element Clusters Xie, Xiaoru Zhu, Mingyu Lu, Siyuan Wang, Zhongfeng Micromachines (Basel) Article Recently, the layer-wise N:M fine-grained sparse neural network algorithm (i.e., every M weights contain N non-zero values) has attracted tremendous attention, as it can effectively reduce computational complexity with negligible accuracy loss. However, the speed-up potential of this algorithm will not be fully exploited if the right hardware support is lacking. In this work, we design an efficient accelerator for N:M sparse convolutional neural networks (CNNs) with layer-wise sparse patterns. First, we analyze the performance of different processing element (PE) structures and extensions to construct a flexible PE architecture. Second, the hardware design accommodates variable sparse convolution dimensions and sparsity ratios. With the sparse PE cluster (SPEC) design, the hardware can efficiently accelerate CNNs with the layer-wise N:M pattern. Finally, we integrate the proposed SPEC into a CNN accelerator with a flexible network-on-chip and a specially designed dataflow. We implement hardware accelerators on Xilinx ZCU102 and VCU118 FPGAs and evaluate them with classical CNNs such as AlexNet, VGG-16, and ResNet-50. Compared with existing accelerators designed for structured and unstructured pruned networks, our design achieves the best power efficiency. MDPI 2023-02-24 /pmc/articles/PMC10057003/ /pubmed/36984936 http://dx.doi.org/10.3390/mi14030528 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Xie, Xiaoru
Zhu, Mingyu
Lu, Siyuan
Wang, Zhongfeng
Efficient Layer-Wise N:M Sparse CNN Accelerator with Flexible SPEC: Sparse Processing Element Clusters
title Efficient Layer-Wise N:M Sparse CNN Accelerator with Flexible SPEC: Sparse Processing Element Clusters
title_full Efficient Layer-Wise N:M Sparse CNN Accelerator with Flexible SPEC: Sparse Processing Element Clusters
title_fullStr Efficient Layer-Wise N:M Sparse CNN Accelerator with Flexible SPEC: Sparse Processing Element Clusters
title_full_unstemmed Efficient Layer-Wise N:M Sparse CNN Accelerator with Flexible SPEC: Sparse Processing Element Clusters
title_short Efficient Layer-Wise N:M Sparse CNN Accelerator with Flexible SPEC: Sparse Processing Element Clusters
title_sort efficient layer-wise n:m sparse cnn accelerator with flexible spec: sparse processing element clusters
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10057003/
https://www.ncbi.nlm.nih.gov/pubmed/36984936
http://dx.doi.org/10.3390/mi14030528
work_keys_str_mv AT xiexiaoru efficientlayerwisenmsparsecnnacceleratorwithflexiblespecsparseprocessingelementclusters
AT zhumingyu efficientlayerwisenmsparsecnnacceleratorwithflexiblespecsparseprocessingelementclusters
AT lusiyuan efficientlayerwisenmsparsecnnacceleratorwithflexiblespecsparseprocessingelementclusters
AT wangzhongfeng efficientlayerwisenmsparsecnnacceleratorwithflexiblespecsparseprocessingelementclusters