
The Poincaré-Shannon Machine: Statistical Physics and Machine Learning Aspects of Information Cohomology


Bibliographic Details
Main author: Baudot, Pierre
Format: Online Article Text
Language: English
Published: MDPI, 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7515411/
http://dx.doi.org/10.3390/e21090881
Collection: PubMed
Description: Previous works established that entropy is uniquely characterized as the first cohomology class in a topos and described some of its applications to the unsupervised classification of gene expression modules or cell types. These studies raised important questions regarding the statistical meaning of the resulting cohomology of information and its interpretation or consequences with respect to usual data analysis and statistical physics. This paper presents the computational methods of information cohomology and proposes interpretations of it in terms of statistical physics and machine learning. To further underline the cohomological nature of information functions and chain rules, the computation of the cohomology in low degrees is detailed to show more directly that the k-multivariate mutual information terms (I_k) are coboundaries. The cocycle condition corresponds to I_k = 0, which generalizes statistical independence to arbitrary degree k. Hence, the cohomology can be interpreted as quantifying statistical dependences and the obstruction to factorization. I develop the computationally tractable subcase of simplicial information cohomology, represented by entropy (H_k) and information (I_k) landscapes and their respective paths, which allows investigating Shannon's information in the multivariate case without assuming independence or identically distributed variables. I give an interpretation of this cohomology in terms of phase transitions in a model of k-body interactions, holding both for statistical physics without mean-field approximations and for data points. The I_1 components define a self-internal energy functional U_k, and the higher I_k components (k ≥ 2) define the contribution of the k-body interactions to a free energy functional G_k (the total correlation).
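The inclusion-exclusion structure behind I_k can be made concrete with a short sketch (my illustration of the standard alternating-sum definition, not code from the paper; the function names are mine). It computes I_k for a discrete joint distribution as I_k = Σ_{∅≠S} (−1)^{|S|+1} H(X_S) and shows that I_2 vanishes for independent variables, the independence condition I_k = 0 mentioned above:

```python
import itertools
import math

def entropy(joint, axes):
    """Shannon entropy (bits) of the marginal of `joint` over the given axes."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[a] for a in axes)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def mutual_information_k(joint, k):
    """k-multivariate mutual information by inclusion-exclusion:
    I_k = sum over nonempty subsets S of (-1)^(|S|+1) * H(X_S)."""
    total = 0.0
    for r in range(1, k + 1):
        for subset in itertools.combinations(range(k), r):
            total += (-1) ** (r + 1) * entropy(joint, subset)
    return total

# Two independent fair bits: I_2 = 0 (independence <=> vanishing I_k).
indep = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
print(round(mutual_information_k(indep, 2), 9))  # 0.0

# A perfectly copied bit: I_2 = H(X) = 1 bit.
copy = {(0, 0): 0.5, (1, 1): 0.5}
print(round(mutual_information_k(copy, 2), 9))  # 1.0
```

For k = 2 this reduces to the ordinary mutual information H(X) + H(Y) − H(X, Y); higher k can take negative values, which is precisely the conditional information negativity the abstract later relates to free-energy minima.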
A basic mean-field model is developed and computed on genetic data, reproducing the usual free-energy landscapes with phase transition and sustaining the analogy of clustering with condensation. The set of information paths in simplicial structures is in bijection with the symmetric group and with random processes, providing a trivial topological expression of the second law of thermodynamics. The local minima of free energy, related to conditional information negativity and conditional independence, characterize a minimum free energy complex. This complex formalizes the minimum free-energy principle in topology, provides a definition of a complex system, and characterizes the multiplicity of local minima that quantifies the diversity observed in biology. I give an interpretation of this complex in terms of unsupervised deep learning, where the neural network architecture is given by the chain complex, and conclude by discussing future supervised applications.
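The free-energy reading and the path structure can likewise be sketched (again an illustrative reconstruction from the stated definitions, with hypothetical function names, not the paper's code): the total correlation G_k = Σ_i H(X_i) − H(X_1, …, X_k) vanishes exactly when the joint distribution factorizes, and each permutation of the variables (an element of the symmetric group) yields a non-decreasing entropy path:

```python
import itertools
import math

def entropy(joint, axes):
    """Shannon entropy (bits) of the marginal of `joint` over the given axes."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[a] for a in axes)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def total_correlation(joint, k):
    """G_k = sum_i H(X_i) - H(X_1,...,X_k); zero iff the joint factorizes."""
    return sum(entropy(joint, (i,)) for i in range(k)) - entropy(joint, tuple(range(k)))

def entropy_path(joint, order):
    """Entropy along one path: H of the first 1, 2, ..., k variables in `order`."""
    return [entropy(joint, tuple(sorted(order[:r]))) for r in range(1, len(order) + 1)]

# XOR triple: any two bits are independent, but the three together are dependent.
joint = {(0, 0, 0): 0.25, (0, 1, 1): 0.25, (1, 0, 1): 0.25, (1, 1, 0): 0.25}
print(total_correlation(joint, 3))  # 1.0

# One entropy path per permutation of the variables (symmetric group);
# each path is non-decreasing, a toy statement of the second law.
for order in itertools.permutations(range(3)):
    path = entropy_path(joint, order)
    assert all(a <= b + 1e-9 for a, b in zip(path, path[1:]))
```

The 3! = 6 permutations here give the paths in the entropy landscape; in the paper's framing, comparing such paths across degrees is what exposes the local free-energy minima.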
Record ID: pubmed-7515411
Institution: National Center for Biotechnology Information
Record format: MEDLINE/PubMed
Journal: Entropy (Basel). Published online: 2019-09-10. © 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).