Deep Learning Accelerators’ Configuration Space Exploration Effect on Performance and Resource Utilization: A Gemmini Case Study
Though custom deep learning (DL) hardware accelerators are attractive for making inferences in edge computing devices, their design and implementation remain a challenge. Open-source frameworks exist for exploring DL hardware accelerators. Gemmini is an open-source systolic array generator for agile DL accelerator exploration.
Main Authors: | Gookyi, Dennis Agyemanh Nana; Lee, Eunchong; Kim, Kyungho; Jang, Sung-Joon; Lee, Sang-Seol |
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2023 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007457/ https://www.ncbi.nlm.nih.gov/pubmed/36904584 http://dx.doi.org/10.3390/s23052380 |
_version_ | 1784905526259220480 |
author | Gookyi, Dennis Agyemanh Nana; Lee, Eunchong; Kim, Kyungho; Jang, Sung-Joon; Lee, Sang-Seol |
author_facet | Gookyi, Dennis Agyemanh Nana; Lee, Eunchong; Kim, Kyungho; Jang, Sung-Joon; Lee, Sang-Seol |
author_sort | Gookyi, Dennis Agyemanh Nana |
collection | PubMed |
description | Though custom deep learning (DL) hardware accelerators are attractive for making inferences in edge computing devices, their design and implementation remain a challenge. Open-source frameworks exist for exploring DL hardware accelerators. Gemmini is an open-source systolic array generator for agile DL accelerator exploration. This paper details the hardware/software components generated using Gemmini. The general matrix-to-matrix multiplication (GEMM) of different dataflow options, including output/weight stationary (OS/WS), was explored in Gemmini to estimate the performance relative to a CPU implementation. The Gemmini hardware was implemented on an FPGA device to explore the effect of several accelerator parameters, including array size, memory capacity, and the CPU/hardware image-to-column (im2col) module, on metrics such as the area, frequency, and power. This work revealed that regarding the performance, the WS dataflow offered a speedup of 3× relative to the OS dataflow, and the hardware im2col operation offered a speedup of 1.1× relative to the operation on the CPU. For hardware resources, an increase in the array size by a factor of 2 led to an increase in both the area and power by a factor of 3.3, and the im2col module led to an increase in area and power by factors of 1.01 and 1.06, respectively. |
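The abstract's image-to-column (im2col) operation lowers a convolution to a general matrix multiplication (GEMM) by unrolling each sliding window of the input into one row of a matrix, so the whole convolution becomes a single matrix product that a systolic array can execute. A minimal NumPy sketch of the idea, purely illustrative and not the paper's or Gemmini's actual implementation:

```python
import numpy as np

def im2col(x, kh, kw, stride=1):
    """Unroll the kh-by-kw sliding windows of a 2D input into rows,
    so that convolution reduces to a single matrix multiplication."""
    h, w = x.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    cols = np.empty((out_h * out_w, kh * kw))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + kh,
                      j * stride:j * stride + kw]
            cols[i * out_w + j] = patch.ravel()
    return cols

# Convolution as GEMM: (out_h*out_w, kh*kw) @ (kh*kw,) gives the
# flattened output feature map.
x = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3))            # 3x3 box filter as a toy kernel
y = im2col(x, 3, 3) @ k.ravel()
```

The trade-off the paper quantifies is where this unrolling runs: doing it on the CPU costs time but no silicon, while a hardware im2col module gave a 1.1× speedup at the cost of roughly 1% more area and 6% more power.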
format | Online Article Text |
id | pubmed-10007457 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-100074572023-03-12 Deep Learning Accelerators’ Configuration Space Exploration Effect on Performance and Resource Utilization: A Gemmini Case Study Gookyi, Dennis Agyemanh Nana; Lee, Eunchong; Kim, Kyungho; Jang, Sung-Joon; Lee, Sang-Seol Sensors (Basel) Article MDPI 2023-02-21 /pmc/articles/PMC10007457/ /pubmed/36904584 http://dx.doi.org/10.3390/s23052380 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Gookyi, Dennis Agyemanh Nana Lee, Eunchong Kim, Kyungho Jang, Sung-Joon Lee, Sang-Seol Deep Learning Accelerators’ Configuration Space Exploration Effect on Performance and Resource Utilization: A Gemmini Case Study |
title | Deep Learning Accelerators’ Configuration Space Exploration Effect on Performance and Resource Utilization: A Gemmini Case Study |
title_full | Deep Learning Accelerators’ Configuration Space Exploration Effect on Performance and Resource Utilization: A Gemmini Case Study |
title_fullStr | Deep Learning Accelerators’ Configuration Space Exploration Effect on Performance and Resource Utilization: A Gemmini Case Study |
title_full_unstemmed | Deep Learning Accelerators’ Configuration Space Exploration Effect on Performance and Resource Utilization: A Gemmini Case Study |
title_short | Deep Learning Accelerators’ Configuration Space Exploration Effect on Performance and Resource Utilization: A Gemmini Case Study |
title_sort | deep learning accelerators’ configuration space exploration effect on performance and resource utilization: a gemmini case study |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007457/ https://www.ncbi.nlm.nih.gov/pubmed/36904584 http://dx.doi.org/10.3390/s23052380 |
work_keys_str_mv | AT gookyidennisagyemanhnana deeplearningacceleratorsconfigurationspaceexplorationeffectonperformanceandresourceutilizationagemminicasestudy AT leeeunchong deeplearningacceleratorsconfigurationspaceexplorationeffectonperformanceandresourceutilizationagemminicasestudy AT kimkyungho deeplearningacceleratorsconfigurationspaceexplorationeffectonperformanceandresourceutilizationagemminicasestudy AT jangsungjoon deeplearningacceleratorsconfigurationspaceexplorationeffectonperformanceandresourceutilizationagemminicasestudy AT leesangseol deeplearningacceleratorsconfigurationspaceexplorationeffectonperformanceandresourceutilizationagemminicasestudy |