Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations

In a CNN (convolutional neural network) accelerator, exploiting the sparsity of activation values is essential to reducing memory traffic and power consumption. Accordingly, some research efforts have been devoted to skipping ineffectual computations (i.e., multiplications by zero). Unlike previous works, in this paper we point out the similarity of activation values: (1) in the same layer of a CNN model, most feature maps are either highly dense or highly sparse; (2) in the same layer of a CNN model, feature maps in different channels are often similar. Based on these two observations, we propose a block-based compression approach that utilizes both the sparsity and the similarity of activation values to further reduce the data volume. Moreover, we design an encoder, a decoder, and an indexing module to support the proposed approach. The encoder translates output activations into the proposed block-based compression format, while the decoder and the indexing module align nonzero values for effectual computations. Benchmark data consistently show that, compared with previous works, the proposed approach greatly reduces both memory traffic and power consumption.

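For readers unfamiliar with this class of technique, the sketch below illustrates the general idea behind block-based compression of sparse activations: each block stores a small zero/nonzero bitmap plus only its nonzero values, and a decoder uses the bitmap to realign the nonzeros for computation. This is a minimal illustrative sketch only; the block size, format, and all names are hypothetical assumptions, not the authors' actual encoder, which additionally exploits cross-channel similarity.

```python
# Illustrative sketch only: a generic bitmap-based block compression for
# sparse activation tensors. Block size, format, and names are hypothetical
# and NOT taken from the paper.
import numpy as np

BLOCK = 8  # hypothetical block size (activations per block)

def encode_blocks(activations: np.ndarray):
    """Split a flat activation vector into blocks; for each block, store
    a zero/nonzero bitmap plus the packed nonzero values."""
    flat = activations.ravel()
    flat = np.pad(flat, (0, (-len(flat)) % BLOCK))  # pad to a block multiple
    blocks = []
    for i in range(0, len(flat), BLOCK):
        chunk = flat[i:i + BLOCK]
        mask = chunk != 0                                # 1 bit per activation
        blocks.append((np.packbits(mask), chunk[mask]))  # bitmap + nonzeros
    return blocks

def decode_blocks(blocks, total_len):
    """Inverse of encode_blocks: scatter nonzeros back by bitmap position."""
    out = np.zeros(len(blocks) * BLOCK,
                   dtype=blocks[0][1].dtype if blocks else np.float32)
    for b, (bitmap, values) in enumerate(blocks):
        mask = np.unpackbits(bitmap)[:BLOCK].astype(bool)
        out[b * BLOCK:(b + 1) * BLOCK][mask] = values
    return out[:total_len]

# Round-trip check on a sparse activation map (post-ReLU activations are
# often mostly zero, which is what makes such schemes pay off).
acts = np.maximum(np.random.randn(4, 16), 0).astype(np.float32)
enc = encode_blocks(acts)
dec = decode_blocks(enc, acts.size).reshape(acts.shape)
assert np.array_equal(acts, dec)
```

With such a format, highly sparse blocks shrink to little more than their bitmap, while dense blocks cost one bitmap of overhead, which is consistent with the paper's observation that feature maps tend to be either highly dense or highly sparse.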

Bibliographic Details
Main Authors: Weng, Yui-Kai; Huang, Shih-Hsu; Kao, Hsu-Yu
Format: Online Article Text
Language: English
Journal: Sensors (Basel)
Published: MDPI, 10 November 2021
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8622461/
https://www.ncbi.nlm.nih.gov/pubmed/34833543
http://dx.doi.org/10.3390/s21227468
Collection: PubMed
Record ID: pubmed-8622461
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
License: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).