
Examining the Causal Structures of Deep Neural Networks Using Information Theory

Deep Neural Networks (DNNs) are often examined at the level of their response to input, such as analyzing the mutual information between nodes and data sets. Yet DNNs can also be examined at the level of causation, exploring “what does what” within the layers of the network itself. Historically, analyzing the causal structure of DNNs has received less attention than understanding their responses to input. Yet definitionally, generalizability must be a function of a DNN’s causal structure as it reflects how the DNN responds to unseen or even not-yet-defined future inputs. Here, we introduce a suite of metrics based on information theory to quantify and track changes in the causal structure of DNNs during training. Specifically, we introduce the effective information (EI) of a feedforward DNN, which is the mutual information between layer input and output following a maximum-entropy perturbation. The EI can be used to assess the degree of causal influence nodes and edges have over their downstream targets in each layer. We show that the EI can be further decomposed in order to examine the sensitivity of a layer (measured by how well edges transmit perturbations) and the degeneracy of a layer (measured by how edge overlap interferes with transmission), along with estimates of the amount of integrated information of a layer. Together, these properties define where each layer lies in the “causal plane”, which can be used to visualize how layer connectivity becomes more sensitive or degenerate over time, and how integration changes during training, revealing how the layer-by-layer causal structure differentiates. These results may help in understanding the generalization capabilities of DNNs and provide foundational tools for making DNNs both more generalizable and more explainable.
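The abstract's central quantity can be made concrete with a small numerical sketch. The snippet below (Python, illustrative only, not the authors' published code) estimates a layer's effective information as the mutual information between input and output when the input is perturbed at maximum entropy, i.e. sampled uniformly; the [0, 1] input range and the summing of histogram-based mutual information over input-output node pairs are assumptions adopted here as a tractable stand-in for the full joint mutual information.

import numpy as np

def effective_information(layer_fn, n_in, n_samples=100_000, n_bins=16, seed=0):
    """Estimate the effective information (EI) of a feedforward layer:
    the mutual information between layer input X and output Y = layer_fn(X)
    under a maximum-entropy perturbation of X (uniform on [0, 1]^n_in).
    Illustrative sketch: layer_fn, the input range, and the pairwise-MI
    decomposition are assumptions, not the paper's exact scheme.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=(n_samples, n_in))  # max-entropy input
    y = layer_fn(x)  # layer output, shape (n_samples, n_out)

    def pairwise_mi(a, b):
        # Binned (histogram) estimate of I(a; b) in bits for two 1-D samples.
        joint, _, _ = np.histogram2d(a, b, bins=n_bins)
        p = joint / joint.sum()
        outer = p.sum(axis=1, keepdims=True) @ p.sum(axis=0, keepdims=True)
        nz = p > 0
        return float(np.sum(p[nz] * np.log2(p[nz] / outer[nz])))

    # Sum MI over all input-output node pairs (edge-level contributions).
    return sum(pairwise_mi(x[:, i], y[:, j])
               for i in range(x.shape[1]) for j in range(y.shape[1]))

# Example usage: EI of a random (hypothetical) 4-to-4 tanh layer.
W = np.random.default_rng(1).normal(size=(4, 4))
print(effective_information(lambda x: np.tanh(x @ W), n_in=4))

On an estimator of this kind, the sensitivity/degeneracy decomposition described in the abstract would plausibly follow by summing the same pairwise terms while perturbing each input node in isolation (sensitivity) and taking the gap between that sum and the EI (degeneracy); the exact procedure is defined in the paper itself.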


Bibliographic Details
Main Authors: Marrow, Scythia; Michaud, Eric J.; Hoel, Erik
Format: Online Article Text
Language: English
Journal: Entropy (Basel)
Published: MDPI, 2020-12-18
Collection: PubMed (record pubmed-7766755, National Center for Biotechnology Information, MEDLINE/PubMed format)
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7766755/
https://www.ncbi.nlm.nih.gov/pubmed/33353094
http://dx.doi.org/10.3390/e22121429
License: © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).