Studying the Evolution of Neural Activation Patterns During Training of Feed-Forward ReLU Networks
The ability of deep neural networks to form powerful emergent representations of complex statistical patterns in data is as remarkable as it is imperfectly understood. For deep ReLU networks, these are encoded in the mixed discrete–continuous structure of linear weight matrices and non-linear binary activations. …
Main Authors: | Hartmann, David; Franzen, Daniel; Brodehl, Sebastian |
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2021 |
Subjects: | Artificial Intelligence |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8733739/ https://www.ncbi.nlm.nih.gov/pubmed/35005614 http://dx.doi.org/10.3389/frai.2021.642374 |
_version_ | 1784627866072252416 |
author | Hartmann, David; Franzen, Daniel; Brodehl, Sebastian |
author_facet | Hartmann, David; Franzen, Daniel; Brodehl, Sebastian |
author_sort | Hartmann, David |
collection | PubMed |
description | The ability of deep neural networks to form powerful emergent representations of complex statistical patterns in data is as remarkable as it is imperfectly understood. For deep ReLU networks, these are encoded in the mixed discrete–continuous structure of linear weight matrices and non-linear binary activations. Our article develops a new technique for instrumenting such networks to efficiently record activation statistics, such as information content (entropy) and similarity of patterns, in real-world training runs. We then study the evolution of activation patterns during training for networks of different architectures using different training and initialization strategies. As a result, we see characteristic behavioral patterns, both general and architecture-related: in particular, most architectures form bottom-up structure, with the exception of highly tuned state-of-the-art architectures and methods (PyramidNet and FixUp), where layers appear to converge more simultaneously. We also observe intermediate dips in entropy in conventional CNNs that are not visible in residual networks. A reference implementation is provided under a free license. |
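The following is a minimal sketch, not the authors' reference implementation (linked above), of the kind of instrumentation the abstract describes, assuming PyTorch: forward hooks record the binary on/off pattern of each ReLU layer, and the empirical Shannon entropy of the observed patterns is computed per layer. The class name `ActivationRecorder` and the toy network are hypothetical; the paper's method records these statistics more efficiently than the naive counting used here.

```python
# Hypothetical sketch of activation-pattern instrumentation for ReLU networks.
# Not the paper's reference implementation; naive counting for illustration only.
from collections import Counter
import math

import torch
import torch.nn as nn


class ActivationRecorder:
    """Collects binary (on/off) ReLU activation patterns per layer via forward hooks."""

    def __init__(self, model: nn.Module):
        self.patterns = {}  # layer name -> Counter over observed binary patterns
        self.handles = []
        for name, module in model.named_modules():
            if isinstance(module, nn.ReLU):
                self.patterns[name] = Counter()
                self.handles.append(module.register_forward_hook(self._make_hook(name)))

    def _make_hook(self, name):
        def hook(module, inputs, output):
            # A unit is "active" if its output is positive; one pattern per sample.
            binary = (output > 0).flatten(start_dim=1)  # shape: (batch, units)
            for row in binary:
                self.patterns[name][tuple(row.tolist())] += 1
        return hook

    def entropy(self, name: str) -> float:
        """Empirical Shannon entropy (in bits) of the recorded patterns for one layer."""
        counts = self.patterns[name]
        total = sum(counts.values())
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    def remove(self):
        for handle in self.handles:
            handle.remove()


if __name__ == "__main__":
    # Toy feed-forward ReLU network and random inputs, purely illustrative.
    model = nn.Sequential(
        nn.Linear(16, 32), nn.ReLU(),
        nn.Linear(32, 32), nn.ReLU(),
        nn.Linear(32, 10),
    )
    recorder = ActivationRecorder(model)
    with torch.no_grad():
        model(torch.randn(256, 16))
    for layer in recorder.patterns:
        print(layer, f"{recorder.entropy(layer):.2f} bits")
    recorder.remove()
```

In a training run, one would repeat this recording at regular intervals to trace how per-layer pattern entropy evolves over epochs; similarity of patterns between checkpoints could be computed from the same recorded Counters.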
format | Online Article Text |
id | pubmed-8733739 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8733739 2022-01-07 Studying the Evolution of Neural Activation Patterns During Training of Feed-Forward ReLU Networks Hartmann, David; Franzen, Daniel; Brodehl, Sebastian Front Artif Intell Artificial Intelligence The ability of deep neural networks to form powerful emergent representations of complex statistical patterns in data is as remarkable as it is imperfectly understood. For deep ReLU networks, these are encoded in the mixed discrete–continuous structure of linear weight matrices and non-linear binary activations. Our article develops a new technique for instrumenting such networks to efficiently record activation statistics, such as information content (entropy) and similarity of patterns, in real-world training runs. We then study the evolution of activation patterns during training for networks of different architectures using different training and initialization strategies. As a result, we see characteristic behavioral patterns, both general and architecture-related: in particular, most architectures form bottom-up structure, with the exception of highly tuned state-of-the-art architectures and methods (PyramidNet and FixUp), where layers appear to converge more simultaneously. We also observe intermediate dips in entropy in conventional CNNs that are not visible in residual networks. A reference implementation is provided under a free license. Frontiers Media S.A. 2021-12-23 /pmc/articles/PMC8733739/ /pubmed/35005614 http://dx.doi.org/10.3389/frai.2021.642374 Text en Copyright © 2021 Hartmann, Franzen and Brodehl. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Artificial Intelligence Hartmann, David; Franzen, Daniel; Brodehl, Sebastian Studying the Evolution of Neural Activation Patterns During Training of Feed-Forward ReLU Networks |
title | Studying the Evolution of Neural Activation Patterns During Training of Feed-Forward ReLU Networks |
title_full | Studying the Evolution of Neural Activation Patterns During Training of Feed-Forward ReLU Networks |
title_fullStr | Studying the Evolution of Neural Activation Patterns During Training of Feed-Forward ReLU Networks |
title_full_unstemmed | Studying the Evolution of Neural Activation Patterns During Training of Feed-Forward ReLU Networks |
title_short | Studying the Evolution of Neural Activation Patterns During Training of Feed-Forward ReLU Networks |
title_sort | studying the evolution of neural activation patterns during training of feed-forward relu networks |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8733739/ https://www.ncbi.nlm.nih.gov/pubmed/35005614 http://dx.doi.org/10.3389/frai.2021.642374 |
work_keys_str_mv | AT hartmanndavid studyingtheevolutionofneuralactivationpatternsduringtrainingoffeedforwardrelunetworks AT franzendaniel studyingtheevolutionofneuralactivationpatternsduringtrainingoffeedforwardrelunetworks AT brodehlsebastian studyingtheevolutionofneuralactivationpatternsduringtrainingoffeedforwardrelunetworks |