Optimized Inference Engine Generation for Advanced Deep Learning Models

Bibliographic Details
Main author: Abrahamse, Robin
Language: eng
Published: 2021
Subjects: Engineering; Computing and Computers
Online access: http://cds.cern.ch/record/2788556
collection CERN
description With the prospect of ever-increasing luminosity in the Large Hadron Collider (LHC), and in particle collider experiments in general, there is a growing demand for efficient data processing and analysis tools in both online and offline settings. Among others, the HLS4ML framework has demonstrated that deep learning inference can be used to execute data analysis tasks efficiently in High Energy Physics (HEP). The current project builds and improves on recent efforts to generate efficient inference engines for deep learning models with hardware-specific optimizations, using the Intel oneDNN library and the HLS4ML framework. A functioning HLS4ML backend was built for translating common deep learning model formats (TensorFlow, Keras, PyTorch, and ONNX) into oneDNN-accelerated inference engines, with support for advanced non-sequential models. The generated inference engines show inference latencies and peak memory footprints that improve on or match those of state-of-the-art tools for the well-known ResNet50V2 and MobileNetV2 models. Moreover, this report includes a brief analysis of system feasibility and of potential future work, the latter of which is already partly in progress.
id cern-2788556
institution European Organization for Nuclear Research (CERN)
record_format invenio
report number CERN-STUDENTS-Note-2021-220
oai identifier oai:cds.cern.ch:2788556
record last updated 2021-10-25T19:24:18Z
topic Engineering
Computing and Computers