No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
A common technique for compressing a neural network is to compute the $k$-rank $\ell_2$ approximation $A_k$ of the matrix $A \in \mathbb{R}^{n \times d}$ via SVD that corresponds to a fully connected layer (or embedding layer). Here, $d$ is the number of input neurons in the layer, $n$ is the number in the next one, and $A_k$ is stored in $O((n+d)k)$ memory instead of $O(nd)$. …
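To make the memory claim in the abstract concrete, here is a minimal sketch of the standard SVD baseline it describes, not the paper's $\ell_p$ algorithm. It assumes NumPy; `svd_compress` and the layer sizes are illustrative, not taken from the paper.

```python
import numpy as np

def svd_compress(A: np.ndarray, k: int):
    """Best rank-k l2 (Frobenius) approximation of A via truncated SVD,
    returned as two factors so that U_k @ V_k equals A_k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k] * s[:k]   # shape (n, k); singular values folded into U
    V_k = Vt[:k, :]          # shape (k, d)
    return U_k, V_k

# Hypothetical layer sizes, for illustration only.
n, d, k = 768, 3072, 64
A = np.random.randn(n, d)                # stand-in for a layer's weight matrix
U_k, V_k = svd_compress(A, k)

# The factors take (n + d) * k numbers instead of n * d.
print((n + d) * k, "vs", n * d)          # 245760 vs 2359296
print(np.linalg.norm(A - U_k @ V_k))     # Frobenius reconstruction error
```

Replacing the weight matrix with the two factors amounts to splitting one fully connected layer into two thinner ones; the resulting accuracy drop is what the fine-tuning step mentioned in the abstract is normally used to recover.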
Main Authors: | Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2021 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8402276/ https://www.ncbi.nlm.nih.gov/pubmed/34451040 http://dx.doi.org/10.3390/s21165599 |
_version_ | 1783745751061364736 |
---|---|
author | Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan
author_facet | Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan
author_sort | Tukan, Murad |
collection | PubMed |
description | A common technique for compressing a neural network is to compute the $k$-rank $\ell_2$ approximation $A_k$ of the matrix $A \in \mathbb{R}^{n \times d}$ via SVD that corresponds to a fully connected layer (or embedding layer). Here, $d$ is the number of input neurons in the layer, $n$ is the number in the next one, and $A_k$ is stored in $O((n+d)k)$ memory instead of $O(nd)$. Then, a fine-tuning step is used to improve this initial compression. However, end users may not have the required computation resources, time, or budget to run this fine-tuning stage. Furthermore, the original training set may not be available. In this paper, we provide an algorithm for compressing neural networks using a similar initial compression time (to common techniques) but without the fine-tuning step. The main idea is replacing the $k$-rank $\ell_2$ approximation with $\ell_p$, for $p \in [1, 2)$, which is known to be less sensitive to outliers but much harder to compute. Our main technical result is a practical and provable approximation algorithm to compute it for any $p \geq 1$, based on modern techniques in computational geometry. Extensive experimental results on the GLUE benchmark for compressing the networks BERT, DistilBERT, XLNet, and RoBERTa confirm this theoretical advantage. |
format | Online Article Text |
id | pubmed-8402276 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8402276 2021-08-29 No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan Sensors (Basel) Article A common technique for compressing a neural network is to compute the $k$-rank $\ell_2$ approximation $A_k$ of the matrix $A \in \mathbb{R}^{n \times d}$ via SVD that corresponds to a fully connected layer (or embedding layer). Here, $d$ is the number of input neurons in the layer, $n$ is the number in the next one, and $A_k$ is stored in $O((n+d)k)$ memory instead of $O(nd)$. Then, a fine-tuning step is used to improve this initial compression. However, end users may not have the required computation resources, time, or budget to run this fine-tuning stage. Furthermore, the original training set may not be available. In this paper, we provide an algorithm for compressing neural networks using a similar initial compression time (to common techniques) but without the fine-tuning step. The main idea is replacing the $k$-rank $\ell_2$ approximation with $\ell_p$, for $p \in [1, 2)$, which is known to be less sensitive to outliers but much harder to compute. Our main technical result is a practical and provable approximation algorithm to compute it for any $p \geq 1$, based on modern techniques in computational geometry. Extensive experimental results on the GLUE benchmark for compressing the networks BERT, DistilBERT, XLNet, and RoBERTa confirm this theoretical advantage. MDPI 2021-08-19 /pmc/articles/PMC8402276/ /pubmed/34451040 http://dx.doi.org/10.3390/s21165599 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article; Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan; No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
title | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_full | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_fullStr | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_full_unstemmed | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_short | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_sort | no fine-tuning, no cry: robust svd for compressing deep networks |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8402276/ https://www.ncbi.nlm.nih.gov/pubmed/34451040 http://dx.doi.org/10.3390/s21165599 |
work_keys_str_mv | AT tukanmurad nofinetuningnocryrobustsvdforcompressingdeepnetworks AT maaloufalaa nofinetuningnocryrobustsvdforcompressingdeepnetworks AT wekslermatan nofinetuningnocryrobustsvdforcompressingdeepnetworks AT feldmandan nofinetuningnocryrobustsvdforcompressingdeepnetworks |
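For reference, one standard formalization of the $\ell_p$ low-rank approximation that the description field says replaces the $\ell_2$ one: fitting the rows of $A$ by a $k$-dimensional subspace while summing the $p$th powers of their Euclidean distances to it. This reading is an assumption; the paper's exact objective may differ in details.

```latex
% Assumed formalization of the l_p objective (not verbatim from the paper):
% the rows a_1, ..., a_n of A are fitted by a k-dimensional subspace S of R^d.
\[
  \min_{\substack{S \subseteq \mathbb{R}^d \\ \dim(S) = k}}
  \; \sum_{i=1}^{n} \operatorname{dist}(a_i, S)^{p},
  \qquad
  \operatorname{dist}(a_i, S) = \min_{x \in S} \lVert a_i - x \rVert_2 .
\]
```

For $p = 2$ the SVD solves this exactly ($A_k$ projects each row onto the span of the top-$k$ right singular vectors); for $p < 2$ large residuals from outlier rows are penalized less severely, which is the robustness the abstract refers to, at the cost of an optimization problem that is much harder to compute.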