Optimized Inference Engine Generation for Advanced Deep Learning Models


Bibliographic Details
Main author: Abrahamse, Robin
Language: eng
Published: 2021
Subjects:
Online access: http://cds.cern.ch/record/2788556
Description
Summary: With the prospect of ever-increasing luminosity in the Large Hadron Collider (LHC) and particle collider experiments in general, there is a growing demand for efficient data processing and analysis tools in both online and offline settings. Among others, the HLS4ML framework has demonstrated that deep learning inference can be used effectively for efficiently executing data analysis tasks in High Energy Physics (HEP). The current project builds and improves on recent efforts to generate efficient inference engines for deep learning models with hardware-specific optimizations using the Intel oneDNN library and the HLS4ML framework. A functioning HLS4ML backend was built for translating common deep learning model formats (TensorFlow, Keras, PyTorch, and ONNX) to oneDNN-accelerated inference engines, with support for advanced non-sequential models. The generated inference engines show inference latencies and peak memory footprints improving on or meeting state-of-the-art tools for the renowned ResNet50V2 and MobileNetV2 models. Moreover, this report includes a brief analysis of system feasibility and potential future work, the latter of which is already partly in progress.