Population Risk Improvement with Model Compression: An Information-Theoretic Approach †

It has been reported in many recent works on deep model compression that the population risk of a compressed model can be even better than that of the original model. In this paper, an information-theoretic explanation for this population risk improvement phenomenon is provided by jointly studying t...

Bibliographic Details
Main Authors: Bu, Yuheng; Gao, Weihao; Zou, Shaofeng; Veeravalli, Venugopal V.
Format: Online Article Text
Language: English
Published: MDPI 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534708/
https://www.ncbi.nlm.nih.gov/pubmed/34681979
http://dx.doi.org/10.3390/e23101255
author Bu, Yuheng
Gao, Weihao
Zou, Shaofeng
Veeravalli, Venugopal V.
collection PubMed
description It has been reported in many recent works on deep model compression that the population risk of a compressed model can be even better than that of the original model. In this paper, an information-theoretic explanation for this population risk improvement phenomenon is provided by jointly studying the decrease in the generalization error and the increase in the empirical risk that results from model compression. It is first shown that model compression reduces an information-theoretic bound on the generalization error, which suggests that model compression can be interpreted as a regularization technique to avoid overfitting. The increase in empirical risk caused by model compression is then characterized using rate distortion theory. These results imply that the overall population risk could be improved by model compression if the decrease in generalization error exceeds the increase in empirical risk. A linear regression example is presented to demonstrate that such a decrease in population risk due to model compression is indeed possible. Our theoretical results further suggest a way to improve a widely used model compression algorithm, i.e., Hessian-weighted K-means clustering, by regularizing the distance between the clustering centers. Experiments with neural networks are provided to validate our theoretical assertions.
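The trade-off described in the abstract can be written as a short decomposition. This is a sketch under assumed notation, none of which appears in this record: $W$ is the learned model, $\hat{W}$ its compressed version, $S$ the training set of $n$ samples, $L_\mu$ the population risk, $L_S$ the empirical risk, and $\mathrm{gen}$ their expected difference. The mutual-information bound in the last line is one standard bound of this kind for $\sigma$-sub-Gaussian losses, assumed here for illustration; the article may state its bound differently.

```latex
\begin{align*}
% Population risk = empirical risk + generalization error (definition of gen)
\mathbb{E}[L_\mu(\hat{W})] &= \mathbb{E}[L_S(\hat{W})] + \mathrm{gen}(\hat{W}) \\[4pt]
% Change in population risk caused by compression
\mathbb{E}[L_\mu(\hat{W})] - \mathbb{E}[L_\mu(W)]
  &= \underbrace{\mathbb{E}[L_S(\hat{W})] - \mathbb{E}[L_S(W)]}_{\text{increase in empirical risk}}
   - \underbrace{\bigl(\mathrm{gen}(W) - \mathrm{gen}(\hat{W})\bigr)}_{\text{decrease in generalization error}} \\[4pt]
% Assumed illustrative bound: S -> W -> \hat{W} is a Markov chain,
% so the data processing inequality gives I(S;\hat{W}) <= I(S;W)
\bigl|\mathrm{gen}(\hat{W})\bigr|
  &\le \sqrt{\frac{2\sigma^2}{n}\, I(S;\hat{W})}
   \le \sqrt{\frac{2\sigma^2}{n}\, I(S;W)}
\end{align*}
```

In this notation, compression lowers the population risk exactly when the second bracketed term in the middle line exceeds the first.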
format Online
Article
Text
id pubmed-8534708
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8534708 2021-10-23 Population Risk Improvement with Model Compression: An Information-Theoretic Approach † Bu, Yuheng; Gao, Weihao; Zou, Shaofeng; Veeravalli, Venugopal V. Entropy (Basel) Article MDPI 2021-09-27 /pmc/articles/PMC8534708/ /pubmed/34681979 http://dx.doi.org/10.3390/e23101255 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
topic Article