PET: Parameter-efficient Knowledge Distillation on Transformer

Given a large Transformer model, how can we obtain a small and computationally efficient model that maintains the performance of the original model? Transformer models have shown significant performance improvements on many NLP tasks in recent years. However, their large size, expensive computational cost...

Bibliographic Details
Main Authors:	Jeon, Hyojin; Park, Seungcheol; Kim, Jin-Gee; Kang, U.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2023
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10325108/ https://www.ncbi.nlm.nih.gov/pubmed/37410716 http://dx.doi.org/10.1371/journal.pone.0288060
