Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing
In the Internet of Things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system is more critical for ensuring Quality of Service than the quality of the...
Main Authors: | Wielgosz, Maciej; Karwatowski, Michał |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651173/ https://www.ncbi.nlm.nih.gov/pubmed/31284516 http://dx.doi.org/10.3390/s19132981 |
_version_ | 1783438284197724160 |
---|---|
author | Wielgosz, Maciej Karwatowski, Michał |
author_facet | Wielgosz, Maciej Karwatowski, Michał |
author_sort | Wielgosz, Maciej |
collection | PubMed |
description | In the Internet of Things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system is more critical for ensuring Quality of Service than the quality of the answer. In this paper, we propose a methodology: a set of predefined steps to be taken in order to map the models to hardware, especially field programmable gate arrays (FPGAs), with the main focus on latency reduction. A multi-objective covariance matrix adaptation evolution strategy (MO-CMA-ES) was employed along with custom scores for sparsity, bit-width of the representation, and quality of the model. Furthermore, we created a framework which enables mapping of neural models to FPGAs. The proposed solution is validated using three case studies and a Xilinx Zynq UltraScale+ MPSoC XCZU15EG as the platform. The results show the compression ratios for quantization and pruning in different scenarios, with and without retraining procedures. Using our publicly available framework, we achieved 210 ns of latency for a single processing step for a model composed of two long short-term memory (LSTM) layers and a single dense layer. |
format | Online Article Text |
id | pubmed-6651173 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6651173 2019-08-07 Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing Wielgosz, Maciej Karwatowski, Michał Sensors (Basel) Article In the Internet of Things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system is more critical for ensuring Quality of Service than the quality of the answer. In this paper, we propose a methodology: a set of predefined steps to be taken in order to map the models to hardware, especially field programmable gate arrays (FPGAs), with the main focus on latency reduction. A multi-objective covariance matrix adaptation evolution strategy (MO-CMA-ES) was employed along with custom scores for sparsity, bit-width of the representation, and quality of the model. Furthermore, we created a framework which enables mapping of neural models to FPGAs. The proposed solution is validated using three case studies and a Xilinx Zynq UltraScale+ MPSoC XCZU15EG as the platform. The results show the compression ratios for quantization and pruning in different scenarios, with and without retraining procedures. Using our publicly available framework, we achieved 210 ns of latency for a single processing step for a model composed of two long short-term memory (LSTM) layers and a single dense layer. MDPI 2019-07-05 /pmc/articles/PMC6651173/ /pubmed/31284516 http://dx.doi.org/10.3390/s19132981 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Wielgosz, Maciej Karwatowski, Michał Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_full | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_fullStr | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_full_unstemmed | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_short | Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing |
title_sort | mapping neural networks to fpga-based iot devices for ultra-low latency processing |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651173/ https://www.ncbi.nlm.nih.gov/pubmed/31284516 http://dx.doi.org/10.3390/s19132981 |
work_keys_str_mv | AT wielgoszmaciej mappingneuralnetworkstofpgabasediotdevicesforultralowlatencyprocessing AT karwatowskimichał mappingneuralnetworkstofpgabasediotdevicesforultralowlatencyprocessing |
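
The abstract above describes MO-CMA-ES searching candidate compressions under custom scores for sparsity, bit-width of the representation, and model quality. The following is a minimal sketch of what such a multi-objective evaluation might look like; it is not the authors' publicly available framework. The function names, the choice of magnitude pruning, symmetric uniform quantization, the 32-bit float baseline, and the reconstruction-error proxy for quality are all assumptions made for illustration.

```python
"""Illustrative sketch (not the paper's code): a multi-objective evaluation of a
candidate (sparsity, bit-width) configuration of the kind MO-CMA-ES could optimize
when compressing a layer's weights for an FPGA datapath."""
import numpy as np


def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned


def quantize_uniform(weights: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization to `bits`-bit signed values, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    if qmax == 0:
        return np.zeros_like(weights)
    scale = np.max(np.abs(weights)) / qmax
    if scale == 0.0:
        return weights.copy()
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale


def evaluate_candidate(weights: np.ndarray, sparsity: float, bits: int):
    """Return three objectives (all lower-is-better): a sparsity score (fraction of
    nonzeros kept), a bit-width score (relative to a 32-bit float baseline), and a
    quality score (reconstruction error as a proxy for accuracy loss without retraining)."""
    compressed = quantize_uniform(prune_by_magnitude(weights, sparsity), bits)
    sparsity_score = 1.0 - sparsity
    bitwidth_score = bits / 32.0
    quality_score = float(np.mean((weights - compressed) ** 2))
    return sparsity_score, bitwidth_score, quality_score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in for an LSTM weight matrix
    for sparsity, bits in [(0.5, 8), (0.8, 6), (0.9, 4)]:
        print((sparsity, bits), evaluate_candidate(w, sparsity, bits))
```

In a full pipeline, a multi-objective optimizer such as MO-CMA-ES would propose (sparsity, bit-width) candidates per layer and keep the Pareto front of these three scores; the quality term would normally be measured on validation data, optionally after retraining, rather than by reconstruction error as in this sketch.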