Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors
Although the quest for more accurate solutions is pushing deep learning research towards larger and more complex algorithms, edge devices demand efficient inference and therefore reduction in model size, latency and energy consumption. One technique to limit model size is quantization, which implies...
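The abstract's mention of quantization as a route to smaller models can be illustrated with a minimal sketch. This is not the paper's heterogeneous, per-layer method; it is a hypothetical `quantize_uniform` helper showing the basic idea: map floating-point weights to low-bit integers plus one scale factor, so storage shrinks roughly by the ratio of bit widths.

```python
import numpy as np

def quantize_uniform(x, bits=8):
    """Uniform symmetric quantization of an array to `bits`-bit signed integers.

    Values are mapped onto 2**bits - 1 evenly spaced levels spanning
    [-max|x|, +max|x|]; the original values are approximately recovered
    as q * scale, with error bounded by half a quantization step.
    """
    levels = 2 ** (bits - 1) - 1            # e.g. 127 usable levels for 8-bit signed
    scale = np.max(np.abs(x)) / levels      # step size between adjacent levels
    q = np.round(x / scale).astype(np.int8 if bits <= 8 else np.int32)
    return q, scale

# Example: quantize a small weight vector to 8 bits.
weights = np.array([0.5, -1.2, 0.03, 0.9])
q, scale = quantize_uniform(weights, bits=8)
dequant = q * scale                          # reconstruction, within scale/2 of the original
```

Heterogeneous quantization, as studied in the paper, generalizes this by choosing a different `bits` per layer (or per parameter group) to trade accuracy against resource use.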
Main authors: Coelho, Claudionor N.; Kuusela, Aki; Li, Shan; Zhuang, Hao; Aarrestad, Thea; Loncar, Vladimir; Ngadiuba, Jennifer; Pierini, Maurizio; Pol, Adrian Alan; Summers, Sioni
Language: English
Published: 2020
Online access: https://dx.doi.org/10.1038/s42256-021-00356-5 and http://cds.cern.ch/record/2724942
Similar items
- Convolutional LSTM models to estimate network traffic, by: Waczynska, Joanna, et al. Published: (2021)
- Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML, by: Loncar, Vladimir, et al. Published: (2021)
- Scanning Test System Prototype of p/sFEB for the ATLAS Phase-I sTGC Trigger Upgrade, by: Wang, Xinxin, et al. Published: (2018)
- Towards Optimal Compression: Joint Pruning and Quantization, by: Zandonati, Ben, et al. Published: (2023)
- Accelerating Recurrent Neural Networks for Gravitational Wave Experiments, by: Que, Zhiqiang, et al. Published: (2021)