A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
Main Authors: | Long, Xin; Zeng, XiangRong; Ben, Zongcheng; Zhou, Dianle; Zhang, Maojun |
Format: | Online Article Text |
Language: | English |
Published: | Hindawi, 2020 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7049432/ https://www.ncbi.nlm.nih.gov/pubmed/32148472 http://dx.doi.org/10.1155/2020/7839064 |
author | Long, Xin; Zeng, XiangRong; Ben, Zongcheng; Zhou, Dianle; Zhang, Maojun
collection | PubMed |
description | The increasing sophistication of neural network models in recent years has sharply expanded their memory consumption and computational cost, hindering deployment on ASICs, FPGAs, and other mobile devices. Compressing and accelerating neural networks is therefore necessary. In this study, we introduce a novel strategy for training low-bit networks whose weights and activations are quantized to a few bits, and we address two corresponding fundamental issues. One is approximating activations through low-bit discretization to decrease network computational cost and dot-product memory. The other is specifying the weight quantization and update mechanism for discrete weights so as to avoid gradient mismatch. With quantized low-bit weights and activations, costly full-precision operations can be replaced by shift operations. We evaluate the proposed method on common datasets, and the results show that it can dramatically compress the neural network with only a slight loss in accuracy. |
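The abstract outlines two ingredients: quantizing weights and activations to a few bits so that multiplications reduce to shifts, and an update rule that lets gradients reach the discrete weights without mismatch. The paper itself is not reproduced in this record, so the snippet below is only a minimal NumPy sketch of that general idea, assuming power-of-two weight quantization and a straight-through estimator; the function names, bit-width parameter, and clipping range are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not the paper's implementation): power-of-two weight
# quantization with a straight-through estimator, in plain NumPy.
import numpy as np

def quantize_power_of_two(w, bits=3):
    """Round weights to signed powers of two so that a multiply by a
    quantized weight can be implemented as a bit shift. 'bits' is an
    assumed exponent budget, not a value from the paper."""
    sign = np.sign(w)
    mag = np.maximum(np.abs(w), 1e-12)          # avoid log2(0)
    exp = np.round(np.log2(mag))                # nearest integer exponent
    max_exp = 0                                 # assumes |w| is roughly <= 1
    min_exp = max_exp - (2 ** (bits - 1) - 1)   # exponent range for the budget
    exp = np.clip(exp, min_exp, max_exp)
    return sign * (2.0 ** exp)

def ste_grad(grad_output, w_fp, clip=1.0):
    """Straight-through estimator: treat the quantizer as identity in the
    backward pass, but zero gradients where the full-precision weight has
    left the clipping range."""
    return grad_output * (np.abs(w_fp) <= clip)

# Toy training step: keep full-precision "shadow" weights for the update,
# use the quantized copy in the forward pass.
rng = np.random.default_rng(0)
w_fp = rng.normal(scale=0.3, size=(4, 4))       # shadow weights
x = rng.normal(size=(4,))
y = quantize_power_of_two(w_fp) @ x             # forward with quantized weights
grad_w = np.outer(np.ones(4), x)                # gradient of sum(y) w.r.t. W
w_fp -= 0.01 * ste_grad(grad_w, w_fp)           # SGD step on shadow weights
```

In a full network the quantized weights would be re-derived from the shadow weights after every update and the activations would be discretized with a similar low-bit rule; both details go beyond what the abstract specifies.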
format | Online Article Text |
id | pubmed-7049432 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-7049432 2020-03-07 A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks Long, Xin; Zeng, XiangRong; Ben, Zongcheng; Zhou, Dianle; Zhang, Maojun Comput Intell Neurosci Research Article Hindawi 2020-02-18 /pmc/articles/PMC7049432/ /pubmed/32148472 http://dx.doi.org/10.1155/2020/7839064 Text en Copyright © 2020 Xin Long et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7049432/ https://www.ncbi.nlm.nih.gov/pubmed/32148472 http://dx.doi.org/10.1155/2020/7839064 |