SensiMix: Sensitivity-Aware 8-bit index & 1-bit value mixed precision quantization for BERT compression
Given a pre-trained BERT, how can we compress it to a fast and lightweight one while maintaining its accuracy? Pre-trained language models, such as BERT, are effective at improving the performance of natural language processing (NLP) tasks. However, heavy models like BERT have problems of large memo...
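The excerpt only names the core idea: quantize different parts of BERT at different precisions according to their sensitivity, keeping sensitive parts at 8 bits and binarizing insensitive parts to 1 bit. Below is a minimal NumPy sketch of such a sensitivity-gated split; the functions `quantize_8bit`, `binarize_1bit`, and `mixed_precision_quantize`, along with the `sensitivity` scores and `threshold`, are illustrative assumptions, since the paper's exact 8-bit index & 1-bit value scheme is not described in this excerpt.

```python
import numpy as np

def quantize_8bit(w: np.ndarray) -> np.ndarray:
    """Simulated uniform symmetric 8-bit quantization: round weights to
    integer levels in [-127, 127], then dequantize back to floats."""
    max_abs = float(np.abs(w).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return np.clip(np.round(w / scale), -127, 127) * scale

def binarize_1bit(w: np.ndarray) -> np.ndarray:
    """1-bit quantization: keep only the sign of each weight plus one
    per-tensor scaling factor, as in binary-weight networks."""
    alpha = float(np.abs(w).mean())
    return alpha * np.sign(w)

def mixed_precision_quantize(layers, sensitivity, threshold=0.5):
    """Pick a precision per layer from its sensitivity score: sensitive
    layers stay at 8 bits, the rest drop to 1 bit. The scores and the
    threshold are placeholders, not the paper's actual criterion."""
    quantized = {}
    for name, w in layers.items():
        if sensitivity[name] >= threshold:
            quantized[name] = quantize_8bit(w)   # accuracy-critical part
        else:
            quantized[name] = binarize_1bit(w)   # compress aggressively
    return quantized

# Toy usage: two random "layers" with different sensitivity scores.
layers = {"attention": np.random.randn(4, 4), "ffn": np.random.randn(4, 4)}
scores = {"attention": 0.9, "ffn": 0.1}
print(mixed_precision_quantize(layers, scores))
```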
Main Authors: | Piao, Tairen; Cho, Ikhyun; Kang, U |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9015158/ https://www.ncbi.nlm.nih.gov/pubmed/35436295 http://dx.doi.org/10.1371/journal.pone.0265621 |