SensiMix: Sensitivity-Aware 8-bit index & 1-bit value mixed precision quantization for BERT compression
Given a pre-trained BERT, how can we compress it to a fast and lightweight one while maintaining its accuracy? Pre-trained language models, such as BERT, are effective for improving the performance of natural language processing (NLP) tasks. However, heavy models like BERT have problems of large memo...
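The "1-bit value" part of the title refers to binarizing weight values. As a rough illustration only, and not the paper's SensiMix algorithm, the sketch below shows a generic sign-based 1-bit weight quantization with a per-tensor scale in PyTorch; the function name `binarize_weights` and the 768x768 layer size are hypothetical choices for the example.

```python
import torch

def binarize_weights(w: torch.Tensor) -> torch.Tensor:
    # Quantize to {-alpha, +alpha}; alpha is the mean absolute value,
    # a common per-tensor scaling choice in binary-weight methods.
    alpha = w.abs().mean()
    return alpha * torch.sign(w)

# Usage: binarize the weights of a hypothetical 768x768 linear layer.
layer = torch.nn.Linear(768, 768)
with torch.no_grad():
    layer.weight.copy_(binarize_weights(layer.weight))
```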
| Main Authors: | Piao, Tairen; Cho, Ikhyun; Kang, U. |
|---|---|
| Format: | Online Article Text |
| Language: | English |
| Published: | Public Library of Science, 2022 |
| Subjects: | |
| Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9015158/ https://www.ncbi.nlm.nih.gov/pubmed/35436295 http://dx.doi.org/10.1371/journal.pone.0265621 |
Similar Items

- GradFreeBits: Gradient-Free Bit Allocation for Mixed-Precision Neural Networks
  by: Bodner, Benjamin Jacob, et al.
  Published: (2022)
- A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
  by: Long, Xin, et al.
  Published: (2020)
- Single Abrikosov vortices as quantized information bits
  by: Golod, T., et al.
  Published: (2015)
- Optimization of the Sampling Periods and the Quantization Bit Lengths for Networked Estimation
  by: Suh, Young Soo, et al.
  Published: (2010)
- Gaussian Multiple Access Channels with One-Bit Quantizer at the Receiver
  by: Rassouli, Borzoo, et al.
  Published: (2018)