Population Risk Improvement with Model Compression: An Information-Theoretic Approach †
It has been reported in many recent works on deep model compression that the population risk of a compressed model can be even better than that of the original model. In this paper, an information-theoretic explanation for this population risk improvement phenomenon is provided by jointly studying t...
Main authors: | Bu, Yuheng; Gao, Weihao; Zou, Shaofeng; Veeravalli, Venugopal V. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534708/ https://www.ncbi.nlm.nih.gov/pubmed/34681979 http://dx.doi.org/10.3390/e23101255 |
_version_ | 1784587609652068352 |
---|---|
author | Bu, Yuheng Gao, Weihao Zou, Shaofeng Veeravalli, Venugopal V. |
author_facet | Bu, Yuheng Gao, Weihao Zou, Shaofeng Veeravalli, Venugopal V. |
author_sort | Bu, Yuheng |
collection | PubMed |
description | It has been reported in many recent works on deep model compression that the population risk of a compressed model can be even better than that of the original model. In this paper, an information-theoretic explanation for this population risk improvement phenomenon is provided by jointly studying the decrease in the generalization error and the increase in the empirical risk that results from model compression. It is first shown that model compression reduces an information-theoretic bound on the generalization error, which suggests that model compression can be interpreted as a regularization technique to avoid overfitting. The increase in empirical risk caused by model compression is then characterized using rate distortion theory. These results imply that the overall population risk could be improved by model compression if the decrease in generalization error exceeds the increase in empirical risk. A linear regression example is presented to demonstrate that such a decrease in population risk due to model compression is indeed possible. Our theoretical results further suggest a way to improve a widely used model compression algorithm, i.e., Hessian-weighted K-means clustering, by regularizing the distance between the clustering centers. Experiments with neural networks are provided to validate our theoretical assertions. |
format | Online Article Text |
id | pubmed-8534708 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-85347082021-10-23 Population Risk Improvement with Model Compression: An Information-Theoretic Approach † Bu, Yuheng Gao, Weihao Zou, Shaofeng Veeravalli, Venugopal V. Entropy (Basel) Article It has been reported in many recent works on deep model compression that the population risk of a compressed model can be even better than that of the original model. In this paper, an information-theoretic explanation for this population risk improvement phenomenon is provided by jointly studying the decrease in the generalization error and the increase in the empirical risk that results from model compression. It is first shown that model compression reduces an information-theoretic bound on the generalization error, which suggests that model compression can be interpreted as a regularization technique to avoid overfitting. The increase in empirical risk caused by model compression is then characterized using rate distortion theory. These results imply that the overall population risk could be improved by model compression if the decrease in generalization error exceeds the increase in empirical risk. A linear regression example is presented to demonstrate that such a decrease in population risk due to model compression is indeed possible. Our theoretical results further suggest a way to improve a widely used model compression algorithm, i.e., Hessian-weighted K-means clustering, by regularizing the distance between the clustering centers. Experiments with neural networks are provided to validate our theoretical assertions. MDPI 2021-09-27 /pmc/articles/PMC8534708/ /pubmed/34681979 http://dx.doi.org/10.3390/e23101255 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Bu, Yuheng Gao, Weihao Zou, Shaofeng Veeravalli, Venugopal V. Population Risk Improvement with Model Compression: An Information-Theoretic Approach † |
title | Population Risk Improvement with Model Compression: An Information-Theoretic Approach † |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534708/ https://www.ncbi.nlm.nih.gov/pubmed/34681979 http://dx.doi.org/10.3390/e23101255 |
work_keys_str_mv | AT buyuheng populationriskimprovementwithmodelcompressionaninformationtheoreticapproach AT gaoweihao populationriskimprovementwithmodelcompressionaninformationtheoreticapproach AT zoushaofeng populationriskimprovementwithmodelcompressionaninformationtheoreticapproach AT veeravallivenugopalv populationriskimprovementwithmodelcompressionaninformationtheoreticapproach |
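
The trade-off stated in the description can be written out as a short worked derivation. The notation is assumed for illustration and does not come from this record: $S$ is a training set of $n$ samples, $W$ the original weights, $\hat{W}$ the compressed weights, $L_\mu$ the population risk, $L_S$ the empirical risk, and $\mathrm{gen}$ the expected generalization error. The mutual-information bound shown is one standard bound of this kind for $\sigma$-sub-Gaussian losses; it is given as an example, not as the paper's exact statement.

```latex
% Requires amsmath. All symbols are illustrative assumptions (see lead-in).
\begin{align}
  % Population risk splits into empirical risk plus generalization error.
  \mathbb{E}[L_\mu(\hat{W})]
    &= \mathbb{E}[L_S(\hat{W})] + \mathrm{gen}(\hat{W}), \\
  % Change in population risk caused by compression.
  \mathbb{E}[L_\mu(\hat{W})] - \mathbb{E}[L_\mu(W)]
    &= \underbrace{\bigl(\mathbb{E}[L_S(\hat{W})] - \mathbb{E}[L_S(W)]\bigr)}_{\text{increase in empirical risk}}
     - \underbrace{\bigl(\mathrm{gen}(W) - \mathrm{gen}(\hat{W})\bigr)}_{\text{decrease in generalization error}}, \\
  % Example bound for a sigma-sub-Gaussian loss.
  \lvert \mathrm{gen}(W) \rvert
    &\le \sqrt{\frac{2\sigma^{2}}{n}\, I(S; W)}, \\
  % Data processing inequality for the Markov chain S -> W -> \hat{W}.
  I(S; \hat{W}) &\le I(S; W).
\end{align}
```

The second line is negative, meaning the compressed model has lower population risk, exactly when the decrease in generalization error exceeds the increase in empirical risk. The last line holds whenever $\hat{W}$ is computed from $W$ alone, which is why compression can only tighten a bound of this form.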