Reconfigurable Architecture and Dataflow for Memory Traffic Minimization of CNNs Computation

Computing a convolutional neural network (CNN) requires a significant amount of memory access, which consumes a large amount of energy. As neural networks scale up, this effect becomes more pronounced: the energy consumed by memory access and data migration between on-chip buffer...

Bibliographic Details
Main Authors:	Cheng, Wei-Kai, Liu, Xiang-Yi, Wu, Hsin-Tzu, Pai, Hsin-Yi, Chung, Po-Yao
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8624143/ https://www.ncbi.nlm.nih.gov/pubmed/34832777 http://dx.doi.org/10.3390/mi12111365

_version_	1784606101747007488
author	Cheng, Wei-Kai Liu, Xiang-Yi Wu, Hsin-Tzu Pai, Hsin-Yi Chung, Po-Yao
author_sort	Cheng, Wei-Kai
collection	PubMed
description	Computing a convolutional neural network (CNN) requires a significant amount of memory access, which consumes a large amount of energy. As neural networks scale up, this effect becomes more pronounced: the energy consumed by memory access and by data migration between the on-chip buffer and off-chip DRAM far exceeds the computation energy of the processing element array (PE array). To reduce the energy consumption of memory access, it is important to find a better dataflow that maximizes data reuse and minimizes data migration between the on-chip buffer and external DRAM. In particular, the dimensions of the input feature map (ifmap) and the filter weights differ considerably from layer to layer. If the array architecture and dataflow cannot be reconfigured layer by layer according to the ifmap and filter dimensions, hardware resources may not be utilized effectively, resulting in a large quantity of data migration on certain layers. However, a thorough exploration of all possible configurations is time-consuming and meaningless. In this paper, we propose a quick and efficient methodology to adapt the configuration of the PE array architecture, buffer assignment, dataflow, and reuse methodology layer by layer for a given CNN architecture and hardware resources. In addition, we explore different combinations of configuration options to investigate their effectiveness; the results can serve as a guide to speed up the thorough exploration process.
format	Online Article Text
id	pubmed-8624143
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8624143 2021-11-27 Reconfigurable Architecture and Dataflow for Memory Traffic Minimization of CNNs Computation Cheng, Wei-Kai Liu, Xiang-Yi Wu, Hsin-Tzu Pai, Hsin-Yi Chung, Po-Yao Micromachines (Basel) Article MDPI 2021-11-05 /pmc/articles/PMC8624143/ /pubmed/34832777 http://dx.doi.org/10.3390/mi12111365 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Reconfigurable Architecture and Dataflow for Memory Traffic Minimization of CNNs Computation
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8624143/ https://www.ncbi.nlm.nih.gov/pubmed/34832777 http://dx.doi.org/10.3390/mi12111365
