
A Low-Power Hardware Architecture for Real-Time CNN Computing

Bibliographic Details
Main authors: Liu, Xinyu; Cao, Chenhong; Duan, Shengyu
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9965634/
https://www.ncbi.nlm.nih.gov/pubmed/36850642
http://dx.doi.org/10.3390/s23042045
author Liu, Xinyu
Cao, Chenhong
Duan, Shengyu
collection PubMed
description Convolutional neural networks (CNNs) are widely deployed on edge devices for tasks such as object detection, image recognition and acoustic recognition. However, the limited resources and strict power constraints of edge devices make it challenging to run computationally intensive CNN models. In addition, for edge applications with real-time requirements, such as real-time computing (RTC) systems, computations must be completed within the required timing constraint, which makes the trade-off between computational latency and power consumption more difficult. In this paper, we propose a low-power CNN accelerator for edge inference in RTC systems, in which computations are performed in a column-wise manner so that the currently available input data can be processed immediately. We observe that most computations of some CNN kernels in deep layers can be spread over multiple cycles without affecting the overall computational latency. We therefore present a multi-cycle scheme for the column-wise convolution operations that reduces hardware resources and power consumption. We present a hardware architecture for the multi-cycle scheme as a domain-specific CNN architecture, implemented in a 65 nm technology. We show that the proposed approach achieves power reductions of up to 8.45%, 49.41% and 50.64% for LeNet, AlexNet and VGG16, respectively. The experimental results show that the approach tends to yield larger power reductions for CNN models with greater depth, larger kernels and more channels.
format Online
Article
Text
id pubmed-9965634
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9965634 2023-02-26 A Low-Power Hardware Architecture for Real-Time CNN Computing. Liu, Xinyu; Cao, Chenhong; Duan, Shengyu. Sensors (Basel), Article. MDPI, 2023-02-11. /pmc/articles/PMC9965634/ /pubmed/36850642 http://dx.doi.org/10.3390/s23042045 Text, en. © 2023 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Low-Power Hardware Architecture for Real-Time CNN Computing
title_sort low-power hardware architecture for real-time cnn computing
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9965634/
https://www.ncbi.nlm.nih.gov/pubmed/36850642
http://dx.doi.org/10.3390/s23042045