A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems
Convolutional neural networks (CNN) have been extensively employed for image classification due to their high accuracy. However, inference is a computationally-intensive process that often requires hardware acceleration to operate in real time. For mobile devices, the power consumption of graphics processors (GPUs) is frequently prohibitive, and field-programmable gate arrays (FPGA) become a solution to perform inference at high speed. Although previous works have implemented CNN inference on FPGAs, their high utilization of on-chip memory and arithmetic resources complicates their application on resource-constrained edge devices. In this paper, we present a scalable, low-power, low-resource-utilization accelerator architecture for inference on the MobileNet V2 CNN. The architecture uses a heterogeneous system with an embedded processor as the main controller, external memory to store network data, and dedicated hardware implemented on reconfigurable logic with a scalable number of processing elements (PE). Implemented on an XCZU7EV FPGA running at 200 MHz and using four PEs, the accelerator infers with 87% top-5 accuracy and processes an image of [Formula: see text] pixels in 220 ms. It consumes 7.35 W of power and uses less than 30% of the logic and arithmetic resources used by other MobileNet FPGA accelerators.
Main Authors: Pérez, Ignacio; Figueroa, Miguel
Format: Online Article Text
Language: English
Published: MDPI, 2021
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8069940/ https://www.ncbi.nlm.nih.gov/pubmed/33918668 http://dx.doi.org/10.3390/s21082637
| _version_ | 1783683354818772992 |
|---|---|
| author | Pérez, Ignacio; Figueroa, Miguel |
| author_facet | Pérez, Ignacio; Figueroa, Miguel |
| author_sort | Pérez, Ignacio |
| collection | PubMed |
| description | Convolutional neural networks (CNN) have been extensively employed for image classification due to their high accuracy. However, inference is a computationally-intensive process that often requires hardware acceleration to operate in real time. For mobile devices, the power consumption of graphics processors (GPUs) is frequently prohibitive, and field-programmable gate arrays (FPGA) become a solution to perform inference at high speed. Although previous works have implemented CNN inference on FPGAs, their high utilization of on-chip memory and arithmetic resources complicates their application on resource-constrained edge devices. In this paper, we present a scalable, low-power, low-resource-utilization accelerator architecture for inference on the MobileNet V2 CNN. The architecture uses a heterogeneous system with an embedded processor as the main controller, external memory to store network data, and dedicated hardware implemented on reconfigurable logic with a scalable number of processing elements (PE). Implemented on an XCZU7EV FPGA running at 200 MHz and using four PEs, the accelerator infers with 87% top-5 accuracy and processes an image of [Formula: see text] pixels in 220 ms. It consumes 7.35 W of power and uses less than 30% of the logic and arithmetic resources used by other MobileNet FPGA accelerators. |
| format | Online Article Text |
| id | pubmed-8069940 |
| institution | National Center for Biotechnology Information |
| language | English |
| publishDate | 2021 |
| publisher | MDPI |
| record_format | MEDLINE/PubMed |
| spelling | pubmed-8069940 2021-04-26 A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems Pérez, Ignacio; Figueroa, Miguel Sensors (Basel) Article Convolutional neural networks (CNN) have been extensively employed for image classification due to their high accuracy. However, inference is a computationally-intensive process that often requires hardware acceleration to operate in real time. For mobile devices, the power consumption of graphics processors (GPUs) is frequently prohibitive, and field-programmable gate arrays (FPGA) become a solution to perform inference at high speed. Although previous works have implemented CNN inference on FPGAs, their high utilization of on-chip memory and arithmetic resources complicates their application on resource-constrained edge devices. In this paper, we present a scalable, low-power, low-resource-utilization accelerator architecture for inference on the MobileNet V2 CNN. The architecture uses a heterogeneous system with an embedded processor as the main controller, external memory to store network data, and dedicated hardware implemented on reconfigurable logic with a scalable number of processing elements (PE). Implemented on an XCZU7EV FPGA running at 200 MHz and using four PEs, the accelerator infers with 87% top-5 accuracy and processes an image of [Formula: see text] pixels in 220 ms. It consumes 7.35 W of power and uses less than 30% of the logic and arithmetic resources used by other MobileNet FPGA accelerators. MDPI 2021-04-09 /pmc/articles/PMC8069940/ /pubmed/33918668 http://dx.doi.org/10.3390/s21082637 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
| spellingShingle | Article; Pérez, Ignacio; Figueroa, Miguel; A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems |
| title | A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems |
| title_full | A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems |
| title_fullStr | A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems |
| title_full_unstemmed | A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems |
| title_short | A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems |
| title_sort | heterogeneous hardware accelerator for image classification in embedded systems |
| topic | Article |
| url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8069940/ https://www.ncbi.nlm.nih.gov/pubmed/33918668 http://dx.doi.org/10.3390/s21082637 |
| work_keys_str_mv | AT perezignacio aheterogeneoushardwareacceleratorforimageclassificationinembeddedsystems; AT figueroamiguel aheterogeneoushardwareacceleratorforimageclassificationinembeddedsystems; AT perezignacio heterogeneoushardwareacceleratorforimageclassificationinembeddedsystems; AT figueroamiguel heterogeneoushardwareacceleratorforimageclassificationinembeddedsystems |