Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml
Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers -- long short-term memory and gated recurrent unit -- within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider.
Main authors: | Khoda, Elham E.; Rankin, Dylan; de Lima, Rafael Teixeira; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; Summers, Sioni; Vernieri, Caterina; Wang, Aaron |
---|---|
Language: | eng |
Published: | 2022 |
Subjects: | stat.ML; physics.ins-det; hep-ex; cs.LG |
Online access: | https://dx.doi.org/10.1088/2632-2153/acc0d7 http://cds.cern.ch/record/2816114 |
_version_ | 1780973561982222336 |
---|---|
author | Khoda, Elham E.; Rankin, Dylan; de Lima, Rafael Teixeira; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; Summers, Sioni; Vernieri, Caterina; Wang, Aaron |
author_facet | Khoda, Elham E.; Rankin, Dylan; de Lima, Rafael Teixeira; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; Summers, Sioni; Vernieri, Caterina; Wang, Aaron |
author_sort | Khoda, Elham E. |
collection | CERN |
description | Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers—long short-term memory and gated recurrent unit—within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider. |
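The description above notes that the hls4ml implementation covers long short-term memory and gated recurrent unit layers. As a reference for the recurrence such a design must compute at each time step, here is a minimal NumPy sketch of one step of a standard GRU cell; the function and weight names are illustrative and do not reflect hls4ml's internal code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    """One step of a standard GRU cell: the recurrence an FPGA design
    like the one described must evaluate for every element of the sequence."""
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)               # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev + br)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)   # candidate state
    return (1 - z) * h_prev + z * h_tilde                # interpolated new state
```

The sequential dependence of each step on `h_prev` is exactly what makes recurrent layers harder to pipeline on an FPGA than feed-forward layers.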
id | cern-2816114 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2022 |
record_format | invenio |
spelling | cern-2816114 2023-04-18T12:53:00Z doi:10.1088/2632-2153/acc0d7 http://cds.cern.ch/record/2816114 eng Khoda, Elham E.; Rankin, Dylan; de Lima, Rafael Teixeira; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; Summers, Sioni; Vernieri, Caterina; Wang, Aaron. Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml. stat.ML Mathematical Physics and Mathematics; physics.ins-det Detectors and Experimental Techniques; hep-ex Particle Physics - Experiment; cs.LG Computing and Computers. Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers -- long short-term memory and gated recurrent unit -- within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider. arXiv:2207.00559 oai:cds.cern.ch:2816114 2022-07-01 |
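The abstract states that the designs "can be customized to meet specific design requirements for inference latencies and FPGA resources." In hls4ml this trade-off is typically expressed through a model configuration, notably the fixed-point precision and the reuse factor (how many times a multiplier is shared across operations). A sketch of such a configuration fragment, with key names following hls4ml's documented YAML layout and purely illustrative values:

```yaml
HLSConfig:
  Model:
    Precision: ap_fixed<16,6>   # fixed-point type: 16 total bits, 6 integer bits
    ReuseFactor: 1              # 1 = fully parallel: lowest latency, most resources
```

Raising `ReuseFactor` serializes multiplications onto fewer DSPs, reducing resource usage at the cost of higher latency; this is one of the knobs the paper's customization refers to.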
spellingShingle | stat.ML Mathematical Physics and Mathematics physics.ins-det Detectors and Experimental Techniques hep-ex Particle Physics - Experiment cs.LG Computing and Computers Khoda, Elham E. Rankin, Dylan de Lima, Rafael Teixeira Harris, Philip Hauck, Scott Hsu, Shih-Chieh Kagan, Michael Loncar, Vladimir Paikara, Chaitanya Rao, Richa Summers, Sioni Vernieri, Caterina Wang, Aaron Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_full | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_fullStr | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_full_unstemmed | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_short | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_sort | ultra-low latency recurrent neural network inference on fpgas for physics applications with hls4ml |
topic | stat.ML Mathematical Physics and Mathematics physics.ins-det Detectors and Experimental Techniques hep-ex Particle Physics - Experiment cs.LG Computing and Computers |
url | https://dx.doi.org/10.1088/2632-2153/acc0d7 http://cds.cern.ch/record/2816114 |
work_keys_str_mv | AT khodaelhame ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT rankindylan ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT delimarafaelteixeira ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT harrisphilip ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT hauckscott ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT hsushihchieh ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT kaganmichael ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT loncarvladimir ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT paikarachaitanya ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT raoricha ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT summerssioni ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT verniericaterina ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT wangaaron ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml |