
On the Difference between the Information Bottleneck and the Deep Information Bottleneck

Combining the information bottleneck model with deep learning by replacing mutual information terms with deep neural nets has proven successful in areas ranging from generative modelling to interpreting deep neural networks. In this paper, we revisit the deep variational information bottleneck and the assumptions needed for its derivation.

Full description

Bibliographic Details
Main Authors: Wieczorek, Aleksander; Roth, Volker
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516540/
https://www.ncbi.nlm.nih.gov/pubmed/33285906
http://dx.doi.org/10.3390/e22020131
author Wieczorek, Aleksander
Roth, Volker
collection PubMed
description Combining the information bottleneck model with deep learning by replacing mutual information terms with deep neural nets has proven successful in areas ranging from generative modelling to interpreting deep neural networks. In this paper, we revisit the deep variational information bottleneck and the assumptions needed for its derivation. The two assumed properties of the data, X and Y, and their latent representation T, take the form of two Markov chains $T - X - Y$ and $X - T - Y$. Requiring both to hold during the optimisation process can be limiting for the set of potential joint distributions $P(X, Y, T)$. We, therefore, show how to circumvent this limitation by optimising a lower bound for the mutual information between T and Y, $I(T; Y)$, for which only the latter Markov chain has to be satisfied. The mutual information $I(T; Y)$ can be split into two non-negative parts. The first part is the lower bound for $I(T; Y)$, which is optimised in deep variational information bottleneck (DVIB) and cognate models in practice. The second part consists of two terms that measure how much the former requirement $T - X - Y$ is violated. Finally, we propose interpreting the family of information bottleneck models as directed graphical models, and show that in this framework, the original and deep information bottlenecks are special cases of a fundamental IB model.
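The record does not reproduce the derivation itself. As a minimal sketch, assuming the commonly used DVIB formulation (an illustration, not necessarily the exact decomposition derived in the paper), the lower bound on $I(T;Y)$ that is optimised in practice reads

\[
I(T;Y) \;\geq\; \mathbb{E}_{p(x,y)}\,\mathbb{E}_{p(t \mid x)}\big[\log q(y \mid t)\big] + H(Y),
\]

where $q(y \mid t)$ is a variational decoder standing in for the intractable $p(y \mid t)$ and $H(Y)$ is constant with respect to the model. The full objective trades this term off against a compression penalty, typically a variational upper bound on $I(X;T)$:

\[
\max_{p(t \mid x),\, q(y \mid t),\, r(t)} \;\; \mathbb{E}_{p(x,y)}\,\mathbb{E}_{p(t \mid x)}\big[\log q(y \mid t)\big] \;-\; \beta\, \mathbb{E}_{p(x)}\Big[\mathrm{KL}\big(p(t \mid x)\,\big\|\,r(t)\big)\Big],
\]

with $r(t)$ a variational marginal and $\beta > 0$ the trade-off parameter. Per the abstract, the gap between $I(T;Y)$ and the first bound consists of two non-negative terms that quantify how strongly the Markov chain $T - X - Y$ is violated.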
format Online
Article
Text
id pubmed-7516540
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling Entropy (Basel), Article. MDPI, published online 2020-01-22. © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title On the Difference between the Information Bottleneck and the Deep Information Bottleneck
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516540/
https://www.ncbi.nlm.nih.gov/pubmed/33285906
http://dx.doi.org/10.3390/e22020131