Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing
In the Internet of Things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system is more critical for ensuring Quality of Service than the quality of the...
Main Authors: | Wielgosz, Maciej; Karwatowski, Michał |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651173/ https://www.ncbi.nlm.nih.gov/pubmed/31284516 http://dx.doi.org/10.3390/s19132981 |
_version_ | 1783438284197724160 |
---|---|
author | Wielgosz, Maciej Karwatowski, Michał |
author_facet | Wielgosz, Maciej Karwatowski, Michał |
author_sort | Wielgosz, Maciej |
collection | PubMed |
description | In the Internet of Things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system is more critical for ensuring Quality of Service than the quality of the answer. In this paper, we propose a methodology: a set of predefined steps to be taken in order to map the models to hardware, especially field programmable gate arrays (FPGAs), with the main focus on latency reduction. A multi-objective covariance matrix adaptation evolution strategy (MO-CMA-ES) was employed along with custom scores for sparsity, bit-width of the representation, and quality of the model. Furthermore, we created a framework which enables mapping of neural models to FPGAs. The proposed solution is validated using three case studies and a Xilinx Zynq UltraScale+ MPSoC XCZU15EG as the platform. The results show the compression ratios for quantization and pruning in different scenarios, with and without retraining procedures. Using our publicly available framework, we achieved 210 ns of latency for a single processing step for a model composed of two long short-term memory (LSTM) layers and a single dense layer. |
format | Online Article Text |
id | pubmed-6651173 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6651173 2019-08-07 Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing Wielgosz, Maciej Karwatowski, Michał Sensors (Basel) Article In the Internet of Things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system is more critical for ensuring Quality of Service than the quality of the answer. In this paper, we propose a methodology: a set of predefined steps to be taken in order to map the models to hardware, especially field programmable gate arrays (FPGAs), with the main focus on latency reduction. A multi-objective covariance matrix adaptation evolution strategy (MO-CMA-ES) was employed along with custom scores for sparsity, bit-width of the representation, and quality of the model. Furthermore, we created a framework which enables mapping of neural models to FPGAs. The proposed solution is validated using three case studies and a Xilinx Zynq UltraScale+ MPSoC XCZU15EG as the platform. The results show the compression ratios for quantization and pruning in different scenarios, with and without retraining procedures. Using our publicly available framework, we achieved 210 ns of latency for a single processing step for a model composed of two long short-term memory (LSTM) layers and a single dense layer. MDPI 2019-07-05 /pmc/articles/PMC6651173/ /pubmed/31284516 http://dx.doi.org/10.3390/s19132981 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Wielgosz, Maciej Karwatowski, Michał Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_full | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_fullStr | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_full_unstemmed | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_short | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_sort | mapping neural networks to fpga-based iot devices for ultra-low latency processing |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651173/ https://www.ncbi.nlm.nih.gov/pubmed/31284516 http://dx.doi.org/10.3390/s19132981 |
work_keys_str_mv | AT wielgoszmaciej mappingneuralnetworkstofpgabasediotdevicesforultralowlatencyprocessing AT karwatowskimichał mappingneuralnetworkstofpgabasediotdevicesforultralowlatencyprocessing |
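
The abstract above describes MO-CMA-ES searching candidate compressions under custom scores for sparsity, bit-width of the representation, and model quality. The following is a minimal sketch of what such a multi-objective evaluation might look like; it is not the authors' publicly available framework. The function names, the choice of magnitude pruning, symmetric uniform quantization, the 32-bit float baseline, and the reconstruction-error proxy for quality are all assumptions made for illustration.

```python
"""Illustrative sketch (not the paper's code): a multi-objective evaluation of a
candidate (sparsity, bit-width) configuration of the kind MO-CMA-ES could optimize
when compressing a layer's weights for an FPGA datapath."""
import numpy as np


def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned


def quantize_uniform(weights: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization to `bits`-bit signed values, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    if qmax == 0:
        return np.zeros_like(weights)
    scale = np.max(np.abs(weights)) / qmax
    if scale == 0.0:
        return weights.copy()
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale


def evaluate_candidate(weights: np.ndarray, sparsity: float, bits: int):
    """Return three objectives (all lower-is-better): a sparsity score (fraction of
    nonzeros kept), a bit-width score (relative to a 32-bit float baseline), and a
    quality score (reconstruction error as a proxy for accuracy loss without retraining)."""
    compressed = quantize_uniform(prune_by_magnitude(weights, sparsity), bits)
    sparsity_score = 1.0 - sparsity
    bitwidth_score = bits / 32.0
    quality_score = float(np.mean((weights - compressed) ** 2))
    return sparsity_score, bitwidth_score, quality_score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in for an LSTM weight matrix
    for sparsity, bits in [(0.5, 8), (0.8, 6), (0.9, 4)]:
        print((sparsity, bits), evaluate_candidate(w, sparsity, bits))
```

In a full pipeline, a multi-objective optimizer such as MO-CMA-ES would propose (sparsity, bit-width) candidates per layer and keep the Pareto front of these three scores; the quality term would normally be measured on validation data, optionally after retraining, rather than by reconstruction error as in this sketch.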