Fast convolutional neural networks on FPGAs with hls4ml
We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond la...
Main Authors: | Aarrestad, Thea; Loncar, Vladimir; Ghielmetti, Nicolò; Pierini, Maurizio; Summers, Sioni; Ngadiuba, Jennifer; Petersson, Christoffer; Linander, Hampus; Iiyama, Yutaro; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Rankin, Dylan; Jindariani, Sergo; Pedro, Kevin; Tran, Nhan; Liu, Mia; Kreinar, Edward; Wu, Zhenbin; Hoang, Duc |
---|---|
Language: | eng |
Published: | 2021 |
Subjects: | |
Online Access: | https://dx.doi.org/10.1088/2632-2153/ac0ea1 http://cds.cern.ch/record/2751704 |
Similar Items
- Product Jacobi-Theta Boltzmann machines with score matching
  by: Pasquale, Andrea, et al.
  Published: (2023)
- End-to-end Sinkhorn Autoencoder with Noise Generator
  by: Deja, Kamil, et al.
  Published: (2020)
- Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML
  by: Loncar, Vladimir, et al.
  Published: (2021)
- QONNX: Representing Arbitrary-Precision Quantized Neural Networks
  by: Pappalardo, Alessandro, et al.
  Published: (2022)
- Real-time semantic segmentation on FPGAs for autonomous vehicles with hls4ml
  by: Ghielmetti, Nicolò, et al.
  Published: (2022)