Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing

In Internet of things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system matters more for ensuring Quality of Service than the quality of the...

Bibliographic Details
Main Authors: Wielgosz, Maciej, Karwatowski, Michał
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651173/
https://www.ncbi.nlm.nih.gov/pubmed/31284516
http://dx.doi.org/10.3390/s19132981
author Wielgosz, Maciej
Karwatowski, Michał
collection PubMed
description In Internet of things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system matters more for ensuring Quality of Service than the quality of the answer. In this paper, we propose a methodology: a set of predefined steps for mapping models to hardware, especially field-programmable gate arrays (FPGAs), with the main focus on latency reduction. A multi-objective covariance matrix adaptation evolution strategy (MO-CMA-ES) was employed along with custom scores for sparsity, bit-width of the representation, and quality of the model. Furthermore, we created a framework that enables mapping of neural models to FPGAs. The proposed solution is validated using three case studies and Xilinx Zynq UltraScale+ MPSoC 285 XCZU15EG as a platform. The results show the compression ratio achieved by quantization and pruning in different scenarios, with and without retraining procedures. Using our publicly available framework, we achieved 210 ns of latency for a single processing step for a model composed of two long short-term memory (LSTM) layers and a single dense layer.
format Online
Article
Text
id pubmed-6651173
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6651173 2019-08-07 Sensors (Basel) Article MDPI 2019-07-05 /pmc/articles/PMC6651173/ /pubmed/31284516 http://dx.doi.org/10.3390/s19132981 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651173/
https://www.ncbi.nlm.nih.gov/pubmed/31284516
http://dx.doi.org/10.3390/s19132981