Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations

In a CNN (convolutional neural network) accelerator, to reduce memory traffic and power consumption, there is a need to exploit the sparsity of activation values. Therefore, some research efforts have been devoted to skipping ineffectual computations (i.e., multiplications by zero). Different from previous...

Bibliographic Details
Main authors:	Weng, Yui-Kai; Huang, Shih-Hsu; Kao, Hsu-Yu
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8622461/ https://www.ncbi.nlm.nih.gov/pubmed/34833543 http://dx.doi.org/10.3390/s21227468

_version_	1784605699273129984
author	Weng, Yui-Kai Huang, Shih-Hsu Kao, Hsu-Yu
author_facet	Weng, Yui-Kai Huang, Shih-Hsu Kao, Hsu-Yu
author_sort	Weng, Yui-Kai
collection	PubMed
description	In a CNN (convolutional neural network) accelerator, to reduce memory traffic and power consumption, there is a need to exploit the sparsity of activation values. Therefore, some research efforts have been devoted to skipping ineffectual computations (i.e., multiplications by zero). Unlike previous works, in this paper we point out the similarity of activation values: (1) in the same layer of a CNN model, most feature maps are either highly dense or highly sparse; (2) in the same layer of a CNN model, feature maps in different channels are often similar. Based on these two observations, we propose a block-based compression approach, which utilizes both the sparsity and the similarity of activation values to further reduce the data volume. Moreover, we also design an encoder, a decoder and an indexing module to support the proposed approach. The encoder is used to translate output activations into the proposed block-based compression format, while both the decoder and the indexing module are used to align nonzero values for effectual computations. Compared with previous works, benchmark data consistently show that the proposed approach can greatly reduce both memory traffic and power consumption.
format	Online Article Text
id	pubmed-8622461
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-86224612021-11-27 Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations Weng, Yui-Kai Huang, Shih-Hsu Kao, Hsu-Yu Sensors (Basel) Article In a CNN (convolutional neural network) accelerator, to reduce memory traffic and power consumption, there is a need to exploit the sparsity of activation values. Therefore, some research efforts have been paid to skip ineffectual computations (i.e., multiplications by zero). Different from previous works, in this paper, we point out the similarity of activation values: (1) in the same layer of a CNN model, most feature maps are either highly dense or highly sparse; (2) in the same layer of a CNN model, feature maps in different channels are often similar. Based on the two observations, we propose a block-based compression approach, which utilizes both the sparsity and the similarity of activation values to further reduce the data volume. Moreover, we also design an encoder, a decoder and an indexing module to support the proposed approach. The encoder is used to translate output activations into the proposed block-based compression format, while both the decoder and the indexing module are used to align nonzero values for effectual computations. Compared with previous works, benchmark data consistently show that the proposed approach can greatly reduce both memory traffic and power consumption. MDPI 2021-11-10 /pmc/articles/PMC8622461/ /pubmed/34833543 http://dx.doi.org/10.3390/s21227468 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Weng, Yui-Kai Huang, Shih-Hsu Kao, Hsu-Yu Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations
title	Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations
title_full	Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations
title_fullStr	Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations
title_full_unstemmed	Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations
title_short	Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations
title_sort	block-based compression and corresponding hardware circuits for sparse activations
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8622461/ https://www.ncbi.nlm.nih.gov/pubmed/34833543 http://dx.doi.org/10.3390/s21227468
work_keys_str_mv	AT wengyuikai blockbasedcompressionandcorrespondinghardwarecircuitsforsparseactivations AT huangshihhsu blockbasedcompressionandcorrespondinghardwarecircuitsforsparseactivations AT kaohsuyu blockbasedcompressionandcorrespondinghardwarecircuitsforsparseactivations
