
Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA


Bibliographic Details
Main Authors: Li, Hengyi, Yue, Xuebin, Wang, Zhichen, Chai, Zhilei, Wang, Wenwen, Tomiyama, Hiroyuki, Meng, Lin
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9177312/
https://www.ncbi.nlm.nih.gov/pubmed/35694575
http://dx.doi.org/10.1155/2022/8039281
_version_ 1784722859010031616
author Li, Hengyi
Yue, Xuebin
Wang, Zhichen
Chai, Zhilei
Wang, Wenwen
Tomiyama, Hiroyuki
Meng, Lin
author_facet Li, Hengyi
Yue, Xuebin
Wang, Zhichen
Chai, Zhilei
Wang, Wenwen
Tomiyama, Hiroyuki
Meng, Lin
author_sort Li, Hengyi
collection PubMed
description To accelerate practical applications of artificial intelligence, this paper proposes a highly efficient layer-wise refined pruning method for deep neural networks at the software level and accelerates the inference process at the hardware level on a field-programmable gate array (FPGA). The refined pruning operation is based on the channel-wise importance indexes of each layer and the layer-wise input sparsity of the convolutional layers. The method exploits the characteristics of the native networks without introducing any extra workload into the training phase, and it is easily extended to various state-of-the-art deep neural networks. The effectiveness of the method is verified on ResNet and VGG architectures with the CIFAR10, CIFAR100, and ImageNet100 datasets. Experimental results show that, for ResNet50 on CIFAR10 and ResNet101 on CIFAR100, more than 85% of the parameters and floating-point operations are pruned with only 0.35% and 0.40% accuracy loss, respectively. For the VGG network, 87.05% of the parameters and 75.78% of the floating-point operations are pruned with only 0.74% accuracy loss for VGG13BN on CIFAR10. Furthermore, we accelerate the networks at the hardware level on the FPGA platform by utilizing the Vitis AI tool. In two-thread mode on the FPGA, the throughput of the pruned VGG13BN and ResNet101 reaches 151.99 fps and 124.31 fps, respectively, and the pruned networks achieve about 4.3× and 1.8× speedup for VGG13BN and ResNet101, respectively, compared with the original networks on the FPGA.
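The channel-wise pruning idea summarized in the abstract can be illustrated with a minimal PyTorch sketch. This is not the authors' released code: using the L1 norm of each convolution filter as the channel-wise importance index, a fixed per-layer prune ratio, and soft (zeroing) pruning are all simplifying assumptions standing in for the paper's layer-wise refined criterion based on importance indexes and input sparsity.

import torch
import torch.nn as nn

def channel_importance(conv: nn.Conv2d) -> torch.Tensor:
    # Weight shape is (out_channels, in_channels, kH, kW); the L1 norm of each
    # output filter serves here as a stand-in channel-wise importance index.
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def prune_mask(conv: nn.Conv2d, prune_ratio: float) -> torch.Tensor:
    # Boolean mask over output channels: keep only the most important fraction.
    scores = channel_importance(conv)
    n_keep = max(1, int(round(scores.numel() * (1.0 - prune_ratio))))
    keep_idx = torch.topk(scores, n_keep).indices
    mask = torch.zeros_like(scores, dtype=torch.bool)
    mask[keep_idx] = True
    return mask

def apply_mask(conv: nn.Conv2d, mask: torch.Tensor) -> None:
    # Soft pruning: zero the weights (and biases) of pruned channels in place.
    # A deployment pipeline would instead rebuild the layer with fewer channels.
    with torch.no_grad():
        conv.weight[~mask] = 0.0
        if conv.bias is not None:
            conv.bias[~mask] = 0.0

# Example: prune half of the output channels in every convolutional layer.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        apply_mask(module, prune_mask(module, prune_ratio=0.5))

In the paper's pipeline the per-layer prune ratio is not fixed but refined layer by layer; the sketch above only shows where such a ratio would enter.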
format Online
Article
Text
id pubmed-9177312
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9177312 2022-06-09 Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA Li, Hengyi Yue, Xuebin Wang, Zhichen Chai, Zhilei Wang, Wenwen Tomiyama, Hiroyuki Meng, Lin Comput Intell Neurosci Research Article To accelerate practical applications of artificial intelligence, this paper proposes a highly efficient layer-wise refined pruning method for deep neural networks at the software level and accelerates the inference process at the hardware level on a field-programmable gate array (FPGA). The refined pruning operation is based on the channel-wise importance indexes of each layer and the layer-wise input sparsity of the convolutional layers. The method exploits the characteristics of the native networks without introducing any extra workload into the training phase, and it is easily extended to various state-of-the-art deep neural networks. The effectiveness of the method is verified on ResNet and VGG architectures with the CIFAR10, CIFAR100, and ImageNet100 datasets. Experimental results show that, for ResNet50 on CIFAR10 and ResNet101 on CIFAR100, more than 85% of the parameters and floating-point operations are pruned with only 0.35% and 0.40% accuracy loss, respectively. For the VGG network, 87.05% of the parameters and 75.78% of the floating-point operations are pruned with only 0.74% accuracy loss for VGG13BN on CIFAR10. Furthermore, we accelerate the networks at the hardware level on the FPGA platform by utilizing the Vitis AI tool. In two-thread mode on the FPGA, the throughput of the pruned VGG13BN and ResNet101 reaches 151.99 fps and 124.31 fps, respectively, and the pruned networks achieve about 4.3× and 1.8× speedup for VGG13BN and ResNet101, respectively, compared with the original networks on the FPGA. Hindawi 2022-06-01 /pmc/articles/PMC9177312/ /pubmed/35694575 http://dx.doi.org/10.1155/2022/8039281 Text en Copyright © 2022 Hengyi Li et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Li, Hengyi
Yue, Xuebin
Wang, Zhichen
Chai, Zhilei
Wang, Wenwen
Tomiyama, Hiroyuki
Meng, Lin
Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA
title Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA
title_full Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA
title_fullStr Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA
title_full_unstemmed Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA
title_short Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA
title_sort optimizing the deep neural networks by layer-wise refined pruning and the acceleration on fpga
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9177312/
https://www.ncbi.nlm.nih.gov/pubmed/35694575
http://dx.doi.org/10.1155/2022/8039281
work_keys_str_mv AT lihengyi optimizingthedeepneuralnetworksbylayerwiserefinedpruningandtheaccelerationonfpga
AT yuexuebin optimizingthedeepneuralnetworksbylayerwiserefinedpruningandtheaccelerationonfpga
AT wangzhichen optimizingthedeepneuralnetworksbylayerwiserefinedpruningandtheaccelerationonfpga
AT chaizhilei optimizingthedeepneuralnetworksbylayerwiserefinedpruningandtheaccelerationonfpga
AT wangwenwen optimizingthedeepneuralnetworksbylayerwiserefinedpruningandtheaccelerationonfpga
AT tomiyamahiroyuki optimizingthedeepneuralnetworksbylayerwiserefinedpruningandtheaccelerationonfpga
AT menglin optimizingthedeepneuralnetworksbylayerwiserefinedpruningandtheaccelerationonfpga