
A Hardware-Friendly Low-Bit Power-of-Two Quantization Method for CNNs and Its FPGA Implementation

To address the problem that convolutional neural networks (CNNs) consume large amounts of hardware resources (such as DSPs and RAM on FPGAs) and that their accuracy, efficiency, and resource usage are difficult to balance, which keeps them from meeting the requirements of industrial applications, we propose an innovative...


Bibliographic Details
Main Authors: Sui, Xuefu, Lv, Qunbo, Bai, Yang, Zhu, Baoyu, Zhi, Liangjie, Yang, Yuanbo, Tan, Zheng
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460272/
https://www.ncbi.nlm.nih.gov/pubmed/36081072
http://dx.doi.org/10.3390/s22176618
author Sui, Xuefu
Lv, Qunbo
Bai, Yang
Zhu, Baoyu
Zhi, Liangjie
Yang, Yuanbo
Tan, Zheng
collection PubMed
description To address the problem that convolutional neural networks (CNNs) consume large amounts of hardware resources (such as DSPs and RAM on FPGAs) and that their accuracy, efficiency, and resource usage are difficult to balance, which keeps them from meeting the requirements of industrial applications, we propose an innovative low-bit power-of-two quantization method: global sign-based network quantization (GSNQ). GSNQ designs different quantization ranges according to the sign of the weights, which provides a larger range of quantization values. Combined with the fine-grained, multi-scale global retraining method proposed in this paper, it effectively reduces the accuracy loss of low-bit quantization. We also propose a novel convolution algorithm that replaces multiplications with shift operations, easing the deployment of GSNQ-quantized models on FPGAs. Quantization comparison experiments on LeNet-5, AlexNet, VGG-Net, ResNet, and GoogLeNet showed that GSNQ achieves higher accuracy than most existing methods and, in most cases, achieves “lossless” low-bit quantization (i.e., the accuracy of the quantized CNN model exceeds the baseline). FPGA comparison experiments showed that our convolution algorithm occupies no on-chip DSPs and has a low overall occupancy of on-chip LUTs and FFs, which effectively improves computational parallelism and demonstrates that GSNQ has good hardware-adaptation capability. This study provides theoretical and experimental support for the industrial application of CNNs.
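The abstract describes the core idea (signed power-of-two weight levels with sign-dependent quantization ranges) but not the exact GSNQ formulas. As a rough illustration only, the following is a minimal Python sketch of generic power-of-two weight quantization with a sign-aware codebook; the function names, bit-width handling, and range construction are assumptions, not the authors' published algorithm.

```python
import numpy as np

def pot_levels(n_bits: int, max_exp: int = 0) -> np.ndarray:
    # Candidate magnitudes: 0 plus descending powers of two from 2**max_exp.
    # (Hypothetical layout; GSNQ's actual sign-dependent ranges may differ.)
    n_nonzero = 2 ** n_bits - 1
    return np.array([0.0] + [2.0 ** (max_exp - i) for i in range(n_nonzero)])

def quantize_pot(w: np.ndarray, n_bits: int = 3) -> np.ndarray:
    # Anchor the codebook at the largest weight magnitude (a common heuristic).
    max_exp = int(np.floor(np.log2(np.abs(w).max())))
    levels = pot_levels(n_bits, max_exp)
    # Snap each |w| to its nearest power-of-two level, then restore the sign,
    # so positive and negative weights effectively use separate codebooks.
    idx = np.argmin(np.abs(np.abs(w)[..., None] - levels), axis=-1)
    return np.sign(w) * levels[idx]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.5, size=8)
    print(w)
    print(quantize_pot(w, n_bits=3))
```

After such a step, every nonzero weight is ±2^k for some integer k, which is the property the shift-based convolution relies on.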
format Online
Article
Text
id pubmed-9460272
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9460272 2022-09-10 A Hardware-Friendly Low-Bit Power-of-Two Quantization Method for CNNs and Its FPGA Implementation. Sui, Xuefu; Lv, Qunbo; Bai, Yang; Zhu, Baoyu; Zhi, Liangjie; Yang, Yuanbo; Tan, Zheng. Sensors (Basel), Article. MDPI 2022-09-01 /pmc/articles/PMC9460272/ /pubmed/36081072 http://dx.doi.org/10.3390/s22176618 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
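Because every quantized weight is zero or a signed power of two, the multiply in each multiply-accumulate operation can be replaced by a bit shift, which is why the authors' convolution algorithm needs no DSP blocks. The sketch below shows this idea for a single integer dot product; it is an illustration under assumed exponent/sign encoding, not the paper's actual FPGA datapath.

```python
def shift_mac(activations, exponents, signs):
    """Multiplication-free dot product for power-of-two weights.

    Each weight is sign * 2**exponent (exponent a nonnegative integer after
    fixed-point scaling), so x * w reduces to a left shift of x.
    Sketch only: the paper's FPGA pipeline is not specified here.
    """
    acc = 0
    for x, e, s in zip(activations, exponents, signs):
        if s == 0:              # a zero weight contributes nothing
            continue
        term = x << e           # multiply by 2**e via shift
        acc += term if s > 0 else -term
    return acc

# Weights +2**1, -2**0, +2**2 applied to activations 3, 5, 1:
# expected 3*2 - 5*1 + 1*4 = 5
print(shift_mac([3, 5, 1], [1, 0, 2], [1, -1, 1]))
```

On an FPGA, a constant shift is just wiring, so each such MAC consumes only LUT/FF resources rather than a DSP slice, consistent with the resource results the abstract reports.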
title A Hardware-Friendly Low-Bit Power-of-Two Quantization Method for CNNs and Its FPGA Implementation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460272/
https://www.ncbi.nlm.nih.gov/pubmed/36081072
http://dx.doi.org/10.3390/s22176618