Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml
Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers -- long short-term memory and gated recurrent unit -- within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider.
Main authors: | Khoda, Elham E.; Rankin, Dylan; de Lima, Rafael Teixeira; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; Summers, Sioni; Vernieri, Caterina; Wang, Aaron |
---|---|
Language: | eng |
Published: | 2022 |
Subjects: | stat.ML; physics.ins-det; hep-ex; cs.LG |
Online access: | https://dx.doi.org/10.1088/2632-2153/acc0d7 http://cds.cern.ch/record/2816114 |
_version_ | 1780973561982222336 |
---|---|
author | Khoda, Elham E.; Rankin, Dylan; de Lima, Rafael Teixeira; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; Summers, Sioni; Vernieri, Caterina; Wang, Aaron |
author_facet | Khoda, Elham E.; Rankin, Dylan; de Lima, Rafael Teixeira; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; Summers, Sioni; Vernieri, Caterina; Wang, Aaron |
author_sort | Khoda, Elham E. |
collection | CERN |
description | Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers—long short-term memory and gated recurrent unit—within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider. |
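The description above notes that the hls4ml implementation covers long short-term memory and gated recurrent unit layers. As a reference for the recurrence such a design must compute at each time step, here is a minimal NumPy sketch of one step of a standard GRU cell; the function and weight names are illustrative and do not reflect hls4ml's internal code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    """One step of a standard GRU cell: the recurrence an FPGA design
    like the one described must evaluate for every element of the sequence."""
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)               # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev + br)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)   # candidate state
    return (1 - z) * h_prev + z * h_tilde                # interpolated new state
```

The sequential dependence of each step on `h_prev` is exactly what makes recurrent layers harder to pipeline on an FPGA than feed-forward layers.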
id | cern-2816114 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2022 |
record_format | invenio |
spelling | cern-2816114 2023-04-18T12:53:00Z doi:10.1088/2632-2153/acc0d7 http://cds.cern.ch/record/2816114 eng Khoda, Elham E.; Rankin, Dylan; de Lima, Rafael Teixeira; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; Summers, Sioni; Vernieri, Caterina; Wang, Aaron. Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml. stat.ML Mathematical Physics and Mathematics; physics.ins-det Detectors and Experimental Techniques; hep-ex Particle Physics - Experiment; cs.LG Computing and Computers. Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers -- long short-term memory and gated recurrent unit -- within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider. arXiv:2207.00559 oai:cds.cern.ch:2816114 2022-07-01 |
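The abstract states that the designs "can be customized to meet specific design requirements for inference latencies and FPGA resources." In hls4ml this trade-off is typically expressed through a model configuration, notably the fixed-point precision and the reuse factor (how many times a multiplier is shared across operations). A sketch of such a configuration fragment, with key names following hls4ml's documented YAML layout and purely illustrative values:

```yaml
HLSConfig:
  Model:
    Precision: ap_fixed<16,6>   # fixed-point type: 16 total bits, 6 integer bits
    ReuseFactor: 1              # 1 = fully parallel: lowest latency, most resources
```

Raising `ReuseFactor` serializes multiplications onto fewer DSPs, reducing resource usage at the cost of higher latency; this is one of the knobs the paper's customization refers to.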
spellingShingle | stat.ML Mathematical Physics and Mathematics physics.ins-det Detectors and Experimental Techniques hep-ex Particle Physics - Experiment cs.LG Computing and Computers Khoda, Elham E. Rankin, Dylan de Lima, Rafael Teixeira Harris, Philip Hauck, Scott Hsu, Shih-Chieh Kagan, Michael Loncar, Vladimir Paikara, Chaitanya Rao, Richa Summers, Sioni Vernieri, Caterina Wang, Aaron Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_full | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_fullStr | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_full_unstemmed | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_short | Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml |
title_sort | ultra-low latency recurrent neural network inference on fpgas for physics applications with hls4ml |
topic | stat.ML Mathematical Physics and Mathematics physics.ins-det Detectors and Experimental Techniques hep-ex Particle Physics - Experiment cs.LG Computing and Computers |
url | https://dx.doi.org/10.1088/2632-2153/acc0d7 http://cds.cern.ch/record/2816114 |
work_keys_str_mv | AT khodaelhame ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT rankindylan ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT delimarafaelteixeira ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT harrisphilip ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT hauckscott ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT hsushihchieh ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT kaganmichael ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT loncarvladimir ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT paikarachaitanya ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT raoricha ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT summerssioni ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT verniericaterina ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml AT wangaaron ultralowlatencyrecurrentneuralnetworkinferenceonfpgasforphysicsapplicationswithhls4ml |