On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches
| Main Authors: | Lyu, Zhaoyan; Aminian, Gholamali; Rodrigues, Miguel R. D. |
|---|---|
| Format: | Online Article Text |
| Language: | English |
| Published: | MDPI, 2023 |
| Subjects: | Article |
| Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10377965/ https://www.ncbi.nlm.nih.gov/pubmed/37510010 http://dx.doi.org/10.3390/e25071063 |
_version_ | 1785079648505298944 |
author | Lyu, Zhaoyan; Aminian, Gholamali; Rodrigues, Miguel R. D.
author_facet | Lyu, Zhaoyan; Aminian, Gholamali; Rodrigues, Miguel R. D.
author_sort | Lyu, Zhaoyan |
collection | PubMed |
description | It is well known that the learning process of a neural network—along with its connections to fitting, compression, and generalization—is not yet well understood. In this paper, we propose a novel approach to capturing such neural network dynamics using information-bottleneck-type techniques, replacing the mutual information measures (which are notoriously difficult to estimate in high-dimensional spaces) with more tractable ones: (1) the minimum mean-squared error (MMSE) associated with reconstructing the network input from some intermediate network representation and (2) the cross-entropy associated with a class label given some network representation. We then conduct an empirical study to ascertain how different network models, learning algorithms, and datasets affect the learning dynamics. Our experiments show that the proposed approach captures network dynamics during both the training and testing phases more reliably than classical information-bottleneck ones. They also reveal that the fitting and compression phases exist regardless of the choice of activation function. Finally, our findings suggest that model architectures, training algorithms, and datasets that lead to better generalization tend to exhibit more pronounced fitting and compression phases. (A minimal code sketch of these two surrogate measures follows this record.) |
format | Online Article Text |
id | pubmed-10377965 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10377965 2023-07-29 On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches Lyu, Zhaoyan; Aminian, Gholamali; Rodrigues, Miguel R. D. Entropy (Basel) Article [abstract as given in the description field above] MDPI 2023-07-14 /pmc/articles/PMC10377965/ /pubmed/37510010 http://dx.doi.org/10.3390/e25071063 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article; Lyu, Zhaoyan; Aminian, Gholamali; Rodrigues, Miguel R. D.; On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches
title | On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches |
title_full | On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches |
title_fullStr | On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches |
title_full_unstemmed | On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches |
title_short | On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches |
title_sort | on neural networks fitting, compression, and generalization behavior via information-bottleneck-like approaches |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10377965/ https://www.ncbi.nlm.nih.gov/pubmed/37510010 http://dx.doi.org/10.3390/e25071063 |
work_keys_str_mv | AT lyuzhaoyan onneuralnetworksfittingcompressionandgeneralizationbehaviorviainformationbottlenecklikeapproaches AT aminiangholamali onneuralnetworksfittingcompressionandgeneralizationbehaviorviainformationbottlenecklikeapproaches AT rodriguesmiguelrd onneuralnetworksfittingcompressionandgeneralizationbehaviorviainformationbottlenecklikeapproaches |
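The description above proposes replacing the mutual information measures of the classical information bottleneck with two tractable surrogates: the MMSE of reconstructing the input X from an intermediate representation T, and the cross-entropy of the label Y given T. The sketch below illustrates one plausible way to estimate both using auxiliary networks in PyTorch. It is a minimal sketch under stated assumptions, not the authors' published protocol: `feature_fn` (a frozen intermediate layer of the trained network), the decoder and probe architectures, and all hyperparameters are hypothetical choices.

```python
# Illustrative sketch: estimating the two surrogate measures described in the
# abstract. `feature_fn` is assumed to be a frozen intermediate layer of the
# trained network; architectures and hyperparameters are hypothetical, not the
# authors' exact experimental setup.
import torch
import torch.nn as nn


def reconstruction_mmse(feature_fn, loader, in_dim, feat_dim, epochs=10, device="cpu"):
    """Train an auxiliary decoder g to reconstruct X from T = feature_fn(X).

    The converged mean-squared error of the decoder upper-bounds the MMSE
    E||X - E[X|T]||^2, the compression-phase surrogate.
    """
    decoder = nn.Sequential(
        nn.Linear(feat_dim, 512), nn.ReLU(), nn.Linear(512, in_dim)
    ).to(device)
    opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, _ in loader:
            x = x.view(x.size(0), -1).to(device)
            with torch.no_grad():
                t = feature_fn(x)  # frozen representation T
            loss = ((decoder(t) - x) ** 2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    # Final per-element MSE over the dataset: the MMSE proxy.
    total, n = 0.0, 0
    with torch.no_grad():
        for x, _ in loader:
            x = x.view(x.size(0), -1).to(device)
            total += ((decoder(feature_fn(x)) - x) ** 2).sum().item()
            n += x.numel()
    return total / n


def label_cross_entropy(feature_fn, loader, feat_dim, n_classes, epochs=10, device="cpu"):
    """Train a linear probe on T and report its cross-entropy, a tractable
    stand-in for H(Y | T): the fitting-phase surrogate."""
    probe = nn.Linear(feat_dim, n_classes).to(device)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.view(x.size(0), -1).to(device), y.to(device)
            with torch.no_grad():
                t = feature_fn(x)
            loss = ce(probe(t), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    # Average cross-entropy over the dataset.
    total, n = 0.0, 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.view(x.size(0), -1).to(device), y.to(device)
            total += ce(probe(feature_fn(x)), y).item() * y.size(0)
            n += y.size(0)
    return total / n
```

Tracking both quantities per layer across training epochs would trace curves analogous to the information plane: falling cross-entropy marks the fitting phase, while rising reconstruction error (input information being discarded by the representation) marks the compression phase.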