
A Low-Power Hardware Architecture for Real-Time CNN Computing

Convolutional neural networks (CNNs) are widely deployed on edge devices, performing tasks such as object detection, image recognition and acoustic recognition. However, the limited resources and strict power constraints of edge devices pose a great challenge to applying the computationally intensiv...


Bibliographic Details
Main authors:	Liu, Xinyu; Cao, Chenhong; Duan, Shengyu
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9965634/ https://www.ncbi.nlm.nih.gov/pubmed/36850642 http://dx.doi.org/10.3390/s23042045

author	Liu, Xinyu; Cao, Chenhong; Duan, Shengyu
collection	PubMed
description	Convolutional neural networks (CNNs) are widely deployed on edge devices, performing tasks such as object detection, image recognition and acoustic recognition. However, the limited resources and strict power constraints of edge devices pose a great challenge to applying computationally intensive CNN models. In addition, for edge applications with real-time requirements, such as real-time computing (RTC) systems, the computations must be completed within the required timing constraint, which makes it more difficult to trade off computational latency against power consumption. In this paper, we propose a low-power CNN accelerator for edge inference in RTC systems, in which the computations are performed in a column-wise manner so that the currently available input data can be processed immediately. We observe that most computations of some CNN kernels in deep layers can be completed over multiple cycles without affecting the overall computational latency. Thus, we present a multi-cycle scheme for the column-wise convolutional operations to reduce hardware resources and power consumption. We present a hardware architecture for the multi-cycle scheme as a domain-specific CNN architecture, which is then implemented in a 65 nm technology. We show that our proposed approach achieves power reductions of up to 8.45%, 49.41% and 50.64% for LeNet, AlexNet and VGG16, respectively. The experimental results show that our approach tends to yield a larger power reduction for CNN models with greater depth, larger kernels and more channels.
format	Online Article Text
id	pubmed-9965634
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9965634 2023-02-26 A Low-Power Hardware Architecture for Real-Time CNN Computing Liu, Xinyu; Cao, Chenhong; Duan, Shengyu Sensors (Basel) Article MDPI 2023-02-11 /pmc/articles/PMC9965634/ /pubmed/36850642 http://dx.doi.org/10.3390/s23042045 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	A Low-Power Hardware Architecture for Real-Time CNN Computing
topic	Article
