No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks

A common technique for compressing a neural network is to compute the $k$-rank $\ell_2$ approximation $A_k$ of the matrix $A \in \mathbb{R}^{n \times d}$ via SVD that corresponds to a fully connected layer (or embedding layer). Here, $d$ is the number of input neurons in the layer, $n$ is the number in the next one, and $A_k$ is stored in $O((n+d)k)$ memory instead of $O(nd)$. Then, a fine-tuning step is used to improve this initial compression. However, end users may not have the required computation resources, time, or budget to run this fine-tuning stage. Furthermore, the original training set may not be available. In this paper, we provide an algorithm for compressing neural networks using a similar initial compression time (to common techniques) but without the fine-tuning step. The main idea is replacing the $k$-rank $\ell_2$ approximation with $\ell_p$, for $p \in [1,2)$, which is known to be less sensitive to outliers but much harder to compute. Our main technical result is a practical and provable approximation algorithm to compute it for any $p \geq 1$, based on modern techniques in computational geometry. Extensive experimental results on the GLUE benchmark for compressing the networks BERT, DistilBERT, XLNet, and RoBERTa confirm this theoretical advantage.
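
To make the baseline concrete, here is a minimal sketch (not code from the paper) of the standard rank-$k$ $\ell_2$ compression of a fully connected layer via truncated SVD; the layer sizes n, d, and k below are illustrative assumptions.

```python
import numpy as np

# Illustrative sizes (assumptions, not from the paper): a fully connected
# layer mapping d = 768 inputs to n = 3072 outputs, compressed to rank k = 64.
n, d, k = 3072, 768, 64
rng = np.random.default_rng(0)
A = rng.standard_normal((n, d))  # the layer's weight matrix

# Truncated SVD: keeping the top k singular triples gives the best
# rank-k approximation of A under the l2 (Frobenius) norm.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k] * s[:k]           # n x k factor, singular values folded in
V_k = Vt[:k, :]                  # k x d factor

# Storage drops from n*d entries to (n + d)*k entries.
print(f"original: {n * d:,}  compressed: {(n + d) * k:,}")

# At inference, the layer map x -> A x is replaced by two smaller
# products x -> U_k (V_k x), which also cuts the multiply count.
x = rng.standard_normal(d)
error = np.linalg.norm(A @ x - U_k @ (V_k @ x))
```

Storing the two factors instead of $A$ is exactly where the $O((n+d)k)$ versus $O(nd)$ memory figure in the abstract comes from.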

Bibliographic Details
Main Authors: Tukan, Murad, Maalouf, Alaa, Weksler, Matan, Feldman, Dan
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8402276/
https://www.ncbi.nlm.nih.gov/pubmed/34451040
http://dx.doi.org/10.3390/s21165599
_version_ 1783745751061364736
author Tukan, Murad
Maalouf, Alaa
Weksler, Matan
Feldman, Dan
author_facet Tukan, Murad
Maalouf, Alaa
Weksler, Matan
Feldman, Dan
author_sort Tukan, Murad
collection PubMed
description A common technique for compressing a neural network is to compute the $k$-rank $\ell_2$ approximation $A_k$ of the matrix $A \in \mathbb{R}^{n \times d}$ via SVD that corresponds to a fully connected layer (or embedding layer). Here, $d$ is the number of input neurons in the layer, $n$ is the number in the next one, and $A_k$ is stored in $O((n+d)k)$ memory instead of $O(nd)$. Then, a fine-tuning step is used to improve this initial compression. However, end users may not have the required computation resources, time, or budget to run this fine-tuning stage. Furthermore, the original training set may not be available. In this paper, we provide an algorithm for compressing neural networks using a similar initial compression time (to common techniques) but without the fine-tuning step. The main idea is replacing the $k$-rank $\ell_2$ approximation with $\ell_p$, for $p \in [1,2)$, which is known to be less sensitive to outliers but much harder to compute. Our main technical result is a practical and provable approximation algorithm to compute it for any $p \geq 1$, based on modern techniques in computational geometry. Extensive experimental results on the GLUE benchmark for compressing the networks BERT, DistilBERT, XLNet, and RoBERTa confirm this theoretical advantage.
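
The paper's provable algorithm for the $\ell_p$ problem is based on computational geometry and is not reproduced here. Purely as an illustration of the objective, the sketch below uses a common iteratively reweighted least squares (IRLS) heuristic for one standard formulation, minimizing $\sum_i \mathrm{dist}(a_i, S)^p$ over $k$-dimensional subspaces $S$; both the formulation and the heuristic are assumptions for illustration, not the authors' method.

```python
import numpy as np

def lp_rank_k_irls(A, k, p=1.0, iters=20, eps=1e-9):
    """Heuristic rank-k l_p approximation via IRLS (illustrative only;
    NOT the provable algorithm of Tukan et al.). Alternates between a
    weighted l2 subspace fit (an SVD of the row-weighted matrix) and
    re-weighting rows so the weighted l2 cost tracks sum_i dist(a_i, S)^p."""
    n, _ = A.shape
    w = np.ones(n)
    for _ in range(iters):
        _, _, Vt = np.linalg.svd(np.sqrt(w)[:, None] * A, full_matrices=False)
        V = Vt[:k].T                                   # d x k orthonormal basis
        r = np.linalg.norm(A - (A @ V) @ V.T, axis=1)  # per-row residual distance
        w = (r + eps) ** (p - 2.0)                     # down-weight outlier rows when p < 2
    return (A @ V) @ V.T                               # rank-k approximation of A
```

For $p = 2$ the weights stay at 1 and the loop reduces to plain SVD; for $p \in [1,2)$ rows with large residuals get smaller weights, which is the outlier robustness the abstract refers to.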
format Online
Article
Text
id pubmed-8402276
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8402276 2021-08-29 No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks Tukan, Murad Maalouf, Alaa Weksler, Matan Feldman, Dan Sensors (Basel) Article A common technique for compressing a neural network is to compute the $k$-rank $\ell_2$ approximation $A_k$ of the matrix $A \in \mathbb{R}^{n \times d}$ via SVD that corresponds to a fully connected layer (or embedding layer). Here, $d$ is the number of input neurons in the layer, $n$ is the number in the next one, and $A_k$ is stored in $O((n+d)k)$ memory instead of $O(nd)$. Then, a fine-tuning step is used to improve this initial compression. However, end users may not have the required computation resources, time, or budget to run this fine-tuning stage. Furthermore, the original training set may not be available. In this paper, we provide an algorithm for compressing neural networks using a similar initial compression time (to common techniques) but without the fine-tuning step. The main idea is replacing the $k$-rank $\ell_2$ approximation with $\ell_p$, for $p \in [1,2)$, which is known to be less sensitive to outliers but much harder to compute. Our main technical result is a practical and provable approximation algorithm to compute it for any $p \geq 1$, based on modern techniques in computational geometry. Extensive experimental results on the GLUE benchmark for compressing the networks BERT, DistilBERT, XLNet, and RoBERTa confirm this theoretical advantage. MDPI 2021-08-19 /pmc/articles/PMC8402276/ /pubmed/34451040 http://dx.doi.org/10.3390/s21165599 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Tukan, Murad
Maalouf, Alaa
Weksler, Matan
Feldman, Dan
No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
title No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
title_full No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
title_fullStr No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
title_full_unstemmed No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
title_short No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
title_sort no fine-tuning, no cry: robust svd for compressing deep networks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8402276/
https://www.ncbi.nlm.nih.gov/pubmed/34451040
http://dx.doi.org/10.3390/s21165599
work_keys_str_mv AT tukanmurad nofinetuningnocryrobustsvdforcompressingdeepnetworks
AT maaloufalaa nofinetuningnocryrobustsvdforcompressingdeepnetworks
AT wekslermatan nofinetuningnocryrobustsvdforcompressingdeepnetworks
AT feldmandan nofinetuningnocryrobustsvdforcompressingdeepnetworks