Generalized Machine Learning Quantization Implementation for High Level Synthesis Targeting FPGAs

Bibliographic Details

Main author: Trahms, Matthew Karl
Language: English
Published: 2022
Subjects:
Online access: http://cds.cern.ch/record/2804953

Description
Summary: The Large Hadron Collider produces a large amount of data while operating, approximately one petabyte per second. The collider is currently being upgraded to collide more particles and produce even more data. To handle this large quantity of data, high-throughput, low-latency algorithms are required to filter interesting collision results out of the rest of the data collected by the sensors attached to the collider. Machine learning algorithms can perform this filtering task with accuracy comparable to the traditional filtering algorithms while offering a wide range of accelerator options. FINN and hls4ml are frameworks for deploying machine learning models on Field Programmable Gate Arrays as high-throughput, low-latency acceleration options. FINN utilizes Brevitas, a quantization-aware training library. Using Brevitas, I trained a particle tracking network and demonstrated equivalent accuracy at lower bit precision than post-training quantization. As a cross-organizational project, the hls4ml and FINN teams collaborated to develop the QONNX standard for representing quantized machine learning models. To integrate QONNX into hls4ml, I implemented new transformations to support the unique structures of QONNX.
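To illustrate the quantization the summary refers to, the sketch below shows generic uniform symmetric quantization of a weight tensor to a small bit width. This is not code from the thesis or from Brevitas; the function name and the per-tensor scaling scheme are illustrative assumptions. Post-training quantization applies a mapping like this to an already-trained network, while quantization-aware training (as in Brevitas) simulates it during training so the network can compensate for the rounding error, which is why it can retain accuracy at lower bit widths.

```python
import numpy as np

def quantize(weights, bit_width):
    """Uniform symmetric quantization (illustrative sketch):
    map floats onto a signed integer grid of the given bit
    width, then de-quantize back to floats."""
    qmax = 2 ** (bit_width - 1) - 1            # e.g. 7 for 4 bits
    scale = np.max(np.abs(weights)) / qmax     # one scale per tensor
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax)
    return q * scale                           # values now lie on the grid

# Example: 4-bit quantization leaves at most scale/2 error per weight
w = np.array([0.91, -0.42, 0.07, -0.88])
print(quantize(w, 4))
```

The rounding error per weight is bounded by half the scale, so fewer bits mean a coarser grid and larger error; quantization-aware training lets the remaining weights absorb that error rather than accepting it after the fact.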