Pea-KD: Parameter-efficient and accurate Knowledge Distillation on BERT
Knowledge Distillation (KD) is one of the widely known methods for model compression. In essence, KD trains a smaller student model based on a larger teacher model and tries to retain the teacher model’s level of performance as much as possible. However, existing KD methods suffer from the following...
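The abstract describes the generic KD setup: a small student model is trained to imitate a larger teacher while keeping as much of its accuracy as possible. As a point of reference only, the sketch below shows the standard soft-target distillation loss (Hinton et al., 2015) in PyTorch; it is a minimal illustration of generic KD, not the Pea-KD method proposed in the paper, and the temperature and alpha values are arbitrary assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Classic soft-target KD loss: KL divergence between the
    temperature-softened student and teacher distributions, blended
    with ordinary cross-entropy on the hard labels.
    `temperature` and `alpha` are illustrative values, not the paper's."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Toy usage with random logits for a 3-class task.
student = torch.randn(8, 3, requires_grad=True)
teacher = torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
loss = distillation_loss(student, teacher, labels)
loss.backward()
```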
Main Authors: Cho, Ikhyun; Kang, U
Format: Online Article / Text
Language: English
Published: Public Library of Science, 2022
Online Access:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8856529/
https://www.ncbi.nlm.nih.gov/pubmed/35180258
http://dx.doi.org/10.1371/journal.pone.0263592
Similar Items
- PET: Parameter-efficient Knowledge Distillation on Transformer
  by: Jeon, Hyojin, et al.
  Published: (2023)
- SensiMix: Sensitivity-Aware 8-bit index & 1-bit value mixed precision quantization for BERT compression
  by: Piao, Tairen, et al.
  Published: (2022)
- Towards Transfer Learning Techniques—BERT, DistilBERT, BERTimbau, and DistilBERTimbau for Automatic Text Classification from Different Languages: A Case Study
  by: Silva Barbon, Rafael, et al.
  Published: (2022)
- LAD: Layer-Wise Adaptive Distillation for BERT Model Compression
  by: Lin, Ying-Jia, et al.
  Published: (2023)
- KD_ConvNeXt: knowledge distillation-based image classification of lung tumor surgical specimen sections
  by: Zheng, Zhaoliang, et al.
  Published: (2023)