No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
A common technique for compressing a neural network is to compute the $k$-rank $\ell_2$ approximation $A_k$ of the matrix $A \in \mathbb{R}^{n \times d}$ via SVD that corresponds to a fully connected layer (or embedding layer). Here, $d$ is the number of input neurons in the layer, $n$ is the number in the next one, and $A_k$ is stored in $O((n+d)k)$ memory instead of $O(nd)$. …
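To make the memory claim in the abstract concrete, here is a minimal sketch of the standard SVD baseline it describes, not the paper's $\ell_p$ algorithm. It assumes NumPy; `svd_compress` and the layer sizes are illustrative, not taken from the paper.

```python
import numpy as np

def svd_compress(A: np.ndarray, k: int):
    """Best rank-k l2 (Frobenius) approximation of A via truncated SVD,
    returned as two factors so that U_k @ V_k equals A_k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k] * s[:k]   # shape (n, k); singular values folded into U
    V_k = Vt[:k, :]          # shape (k, d)
    return U_k, V_k

# Hypothetical layer sizes, for illustration only.
n, d, k = 768, 3072, 64
A = np.random.randn(n, d)                # stand-in for a layer's weight matrix
U_k, V_k = svd_compress(A, k)

# The factors take (n + d) * k numbers instead of n * d.
print((n + d) * k, "vs", n * d)          # 245760 vs 2359296
print(np.linalg.norm(A - U_k @ V_k))     # Frobenius reconstruction error
```

Replacing the weight matrix with the two factors amounts to splitting one fully connected layer into two thinner ones; the resulting accuracy drop is what the fine-tuning step mentioned in the abstract is normally used to recover.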
Main Authors: | Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2021 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8402276/ https://www.ncbi.nlm.nih.gov/pubmed/34451040 http://dx.doi.org/10.3390/s21165599 |
_version_ | 1783745751061364736 |
---|---|
author | Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan
author_facet | Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan
author_sort | Tukan, Murad |
collection | PubMed |
description | A common technique for compressing a neural network is to compute the $k$-rank $\ell_2$ approximation $A_k$ of the matrix $A \in \mathbb{R}^{n \times d}$ via SVD that corresponds to a fully connected layer (or embedding layer). Here, $d$ is the number of input neurons in the layer, $n$ is the number in the next one, and $A_k$ is stored in $O((n+d)k)$ memory instead of $O(nd)$. Then, a fine-tuning step is used to improve this initial compression. However, end users may not have the required computation resources, time, or budget to run this fine-tuning stage. Furthermore, the original training set may not be available. In this paper, we provide an algorithm for compressing neural networks using a similar initial compression time (to common techniques) but without the fine-tuning step. The main idea is replacing the $k$-rank $\ell_2$ approximation with $\ell_p$, for $p \in [1, 2)$, which is known to be less sensitive to outliers but much harder to compute. Our main technical result is a practical and provable approximation algorithm to compute it for any $p \geq 1$, based on modern techniques in computational geometry. Extensive experimental results on the GLUE benchmark for compressing the networks BERT, DistilBERT, XLNet, and RoBERTa confirm this theoretical advantage. |
format | Online Article Text |
id | pubmed-8402276 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8402276 2021-08-29 No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan Sensors (Basel) Article A common technique for compressing a neural network is to compute the $k$-rank $\ell_2$ approximation $A_k$ of the matrix $A \in \mathbb{R}^{n \times d}$ via SVD that corresponds to a fully connected layer (or embedding layer). Here, $d$ is the number of input neurons in the layer, $n$ is the number in the next one, and $A_k$ is stored in $O((n+d)k)$ memory instead of $O(nd)$. Then, a fine-tuning step is used to improve this initial compression. However, end users may not have the required computation resources, time, or budget to run this fine-tuning stage. Furthermore, the original training set may not be available. In this paper, we provide an algorithm for compressing neural networks using a similar initial compression time (to common techniques) but without the fine-tuning step. The main idea is replacing the $k$-rank $\ell_2$ approximation with $\ell_p$, for $p \in [1, 2)$, which is known to be less sensitive to outliers but much harder to compute. Our main technical result is a practical and provable approximation algorithm to compute it for any $p \geq 1$, based on modern techniques in computational geometry. Extensive experimental results on the GLUE benchmark for compressing the networks BERT, DistilBERT, XLNet, and RoBERTa confirm this theoretical advantage. MDPI 2021-08-19 /pmc/articles/PMC8402276/ /pubmed/34451040 http://dx.doi.org/10.3390/s21165599 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article; Tukan, Murad; Maalouf, Alaa; Weksler, Matan; Feldman, Dan; No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
title | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_full | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_fullStr | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_full_unstemmed | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_short | No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks |
title_sort | no fine-tuning, no cry: robust svd for compressing deep networks |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8402276/ https://www.ncbi.nlm.nih.gov/pubmed/34451040 http://dx.doi.org/10.3390/s21165599 |
work_keys_str_mv | AT tukanmurad nofinetuningnocryrobustsvdforcompressingdeepnetworks AT maaloufalaa nofinetuningnocryrobustsvdforcompressingdeepnetworks AT wekslermatan nofinetuningnocryrobustsvdforcompressingdeepnetworks AT feldmandan nofinetuningnocryrobustsvdforcompressingdeepnetworks |
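For reference, one standard formalization of the $\ell_p$ low-rank approximation that the description field says replaces the $\ell_2$ one: fitting the rows of $A$ by a $k$-dimensional subspace while summing the $p$th powers of their Euclidean distances to it. This reading is an assumption; the paper's exact objective may differ in details.

```latex
% Assumed formalization of the l_p objective (not verbatim from the paper):
% the rows a_1, ..., a_n of A are fitted by a k-dimensional subspace S of R^d.
\[
  \min_{\substack{S \subseteq \mathbb{R}^d \\ \dim(S) = k}}
  \; \sum_{i=1}^{n} \operatorname{dist}(a_i, S)^{p},
  \qquad
  \operatorname{dist}(a_i, S) = \min_{x \in S} \lVert a_i - x \rVert_2 .
\]
```

For $p = 2$ the SVD solves this exactly ($A_k$ projects each row onto the span of the top-$k$ right singular vectors); for $p < 2$ large residuals from outlier rows are penalized less severely, which is the robustness the abstract refers to, at the cost of an optimization problem that is much harder to compute.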