
A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks

The increase in sophistication of neural network models in recent years has exponentially expanded memory consumption and computational cost, thereby hindering their deployment on ASICs, FPGAs, and other mobile devices. Compressing and accelerating neural networks is therefore necessary. In this study, we introduce a novel strategy to train low-bit networks with weights and activations quantized to a few bits, and we address two corresponding fundamental issues. One is to approximate activations through low-bit discretization to decrease network computational cost and dot-product memory. The other is to specify a weight quantization and update mechanism for discrete weights that avoids gradient mismatch. With quantized low-bit weights and activations, costly full-precision operations are replaced by shift operations. We evaluate the proposed method on common datasets, and the results show that it can dramatically compress the neural network with only a slight loss in accuracy.


Bibliographic Details
Main Authors: Long, Xin, Zeng, XiangRong, Ben, Zongcheng, Zhou, Dianle, Zhang, Maojun
Format: Online Article Text
Language: English
Published: Hindawi 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7049432/
https://www.ncbi.nlm.nih.gov/pubmed/32148472
http://dx.doi.org/10.1155/2020/7839064
_version_ 1783502438920093696
author Long, Xin
Zeng, XiangRong
Ben, Zongcheng
Zhou, Dianle
Zhang, Maojun
author_facet Long, Xin
Zeng, XiangRong
Ben, Zongcheng
Zhou, Dianle
Zhang, Maojun
author_sort Long, Xin
collection PubMed
description The increase in sophistication of neural network models in recent years has exponentially expanded memory consumption and computational cost, thereby hindering their deployment on ASICs, FPGAs, and other mobile devices. Compressing and accelerating neural networks is therefore necessary. In this study, we introduce a novel strategy to train low-bit networks with weights and activations quantized to a few bits, and we address two corresponding fundamental issues. One is to approximate activations through low-bit discretization to decrease network computational cost and dot-product memory. The other is to specify a weight quantization and update mechanism for discrete weights that avoids gradient mismatch. With quantized low-bit weights and activations, costly full-precision operations are replaced by shift operations. We evaluate the proposed method on common datasets, and the results show that it can dramatically compress the neural network with only a slight loss in accuracy.
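
The description outlines two technical ingredients: low-bit discretization of activations, and a weight quantization plus update mechanism under which full-precision multiplies reduce to shift operations. Below is a minimal NumPy sketch of what such a scheme can look like; the signed power-of-two weight format, the exponent window, the bit widths, and the straight-through-estimator comment are illustrative assumptions and do not reproduce the authors' actual algorithm.

    import numpy as np

    def quantize_weights_pow2(w, bits=3):
        # Quantize weights to signed powers of two, w_q = sign(w) * 2**e, so that
        # a multiply a * w_q can be realised in hardware as a bit shift of a by e
        # positions plus a sign flip. The window of 2**bits exponent values is an
        # illustrative choice, not the paper's setting.
        sign = np.sign(w)
        mag = np.clip(np.abs(w), 1e-8, None)      # avoid log2(0)
        e = np.round(np.log2(mag))
        e_hi = e.max()
        e = np.clip(e, e_hi - (2 ** bits - 1), e_hi)
        return sign * (2.0 ** e), e.astype(int), sign

    def quantize_activations(a, bits=4):
        # Uniform low-bit discretization of non-negative (post-ReLU) activations.
        levels = 2 ** bits - 1
        scale = a.max() / levels if a.max() > 0 else 1.0
        return np.round(a / scale) * scale

    # Toy forward pass: on shift-capable hardware, every multiply inside
    # a_q @ w_q.T would become a shift by the stored exponent.
    w = 0.1 * np.random.randn(8, 8)
    a = np.maximum(np.random.randn(8), 0.0)
    w_q, exponents, signs = quantize_weights_pow2(w)
    a_q = quantize_activations(a)
    y = a_q @ w_q.T

    # Straight-through estimator (assumed here as a standard remedy for gradient
    # mismatch): the forward pass uses w_q, while the backward pass treats the
    # quantizer as the identity and applies the gradient to the latent
    # full-precision weights, which are re-quantized after each update.
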
format Online
Article
Text
id pubmed-7049432
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-7049432 2020-03-07 A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks Long, Xin Zeng, XiangRong Ben, Zongcheng Zhou, Dianle Zhang, Maojun Comput Intell Neurosci Research Article The increase in sophistication of neural network models in recent years has exponentially expanded memory consumption and computational cost, thereby hindering their deployment on ASICs, FPGAs, and other mobile devices. Compressing and accelerating neural networks is therefore necessary. In this study, we introduce a novel strategy to train low-bit networks with weights and activations quantized to a few bits, and we address two corresponding fundamental issues. One is to approximate activations through low-bit discretization to decrease network computational cost and dot-product memory. The other is to specify a weight quantization and update mechanism for discrete weights that avoids gradient mismatch. With quantized low-bit weights and activations, costly full-precision operations are replaced by shift operations. We evaluate the proposed method on common datasets, and the results show that it can dramatically compress the neural network with only a slight loss in accuracy. Hindawi 2020-02-18 /pmc/articles/PMC7049432/ /pubmed/32148472 http://dx.doi.org/10.1155/2020/7839064 Text en Copyright © 2020 Xin Long et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Long, Xin
Zeng, XiangRong
Ben, Zongcheng
Zhou, Dianle
Zhang, Maojun
A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
title A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
title_full A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
title_fullStr A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
title_full_unstemmed A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
title_short A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
title_sort novel low-bit quantization strategy for compressing deep neural networks
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7049432/
https://www.ncbi.nlm.nih.gov/pubmed/32148472
http://dx.doi.org/10.1155/2020/7839064
work_keys_str_mv AT longxin anovellowbitquantizationstrategyforcompressingdeepneuralnetworks
AT zengxiangrong anovellowbitquantizationstrategyforcompressingdeepneuralnetworks
AT benzongcheng anovellowbitquantizationstrategyforcompressingdeepneuralnetworks
AT zhoudianle anovellowbitquantizationstrategyforcompressingdeepneuralnetworks
AT zhangmaojun anovellowbitquantizationstrategyforcompressingdeepneuralnetworks
AT longxin novellowbitquantizationstrategyforcompressingdeepneuralnetworks
AT zengxiangrong novellowbitquantizationstrategyforcompressingdeepneuralnetworks
AT benzongcheng novellowbitquantizationstrategyforcompressingdeepneuralnetworks
AT zhoudianle novellowbitquantizationstrategyforcompressingdeepneuralnetworks
AT zhangmaojun novellowbitquantizationstrategyforcompressingdeepneuralnetworks