Population Risk Improvement with Model Compression: An Information-Theoretic Approach †

It has been reported in many recent works on deep model compression that the population risk of a compressed model can be even better than that of the original model. In this paper, an information-theoretic explanation for this population risk improvement phenomenon is provided by jointly studying t...
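The trade-off the abstract describes can be written as a one-line decomposition. The notation below is an assumption chosen for illustration and is not taken from this record: $W$ is the original model, $\hat{W}$ the compressed model, $L_\mu$ the population risk, $L_S$ the empirical risk on the training set $S$, and $\mathrm{gen}(\cdot)$ the expected gap between the two.

```latex
% Assumed notation (illustrative, not from the record):
%   gen(W) := E[ L_mu(W) - L_S(W) ]
\begin{align*}
\mathbb{E}[L_\mu(\hat{W})] - \mathbb{E}[L_\mu(W)]
  &= \underbrace{\mathbb{E}[L_S(\hat{W})] - \mathbb{E}[L_S(W)]}_{\text{increase in empirical risk}}
   \;-\; \underbrace{\bigl(\mathrm{gen}(W) - \mathrm{gen}(\hat{W})\bigr)}_{\text{decrease in generalization error}}
\end{align*}
```

Under these definitions, the left-hand side is negative, i.e., compression improves the population risk, exactly when the decrease in generalization error exceeds the increase in empirical risk.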

Bibliographic Details
Main authors:	Bu, Yuheng; Gao, Weihao; Zou, Shaofeng; Veeravalli, Venugopal V.
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534708/ https://www.ncbi.nlm.nih.gov/pubmed/34681979 http://dx.doi.org/10.3390/e23101255

collection	PubMed
description	It has been reported in many recent works on deep model compression that the population risk of a compressed model can be even better than that of the original model. In this paper, an information-theoretic explanation for this population risk improvement phenomenon is provided by jointly studying the decrease in the generalization error and the increase in the empirical risk that results from model compression. It is first shown that model compression reduces an information-theoretic bound on the generalization error, which suggests that model compression can be interpreted as a regularization technique to avoid overfitting. The increase in empirical risk caused by model compression is then characterized using rate distortion theory. These results imply that the overall population risk could be improved by model compression if the decrease in generalization error exceeds the increase in empirical risk. A linear regression example is presented to demonstrate that such a decrease in population risk due to model compression is indeed possible. Our theoretical results further suggest a way to improve a widely used model compression algorithm, i.e., Hessian-weighted K-means clustering, by regularizing the distance between the clustering centers. Experiments with neural networks are provided to validate our theoretical assertions.
id	pubmed-8534708
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
journal	Entropy (Basel)
published	MDPI, 2021-09-27
license	© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
