Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning

Bibliographic Details
Main Author: Lu, Chenguang
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10217299/
https://www.ncbi.nlm.nih.gov/pubmed/37238557
http://dx.doi.org/10.3390/e25050802
author Lu, Chenguang
author_facet Lu, Chenguang
author_sort Lu, Chenguang
collection PubMed
description A new trend in deep learning, represented by Mutual Information Neural Estimation (MINE) and Information Noise-Contrastive Estimation (InfoNCE), is emerging. In this trend, similarity functions and Estimated Mutual Information (EMI) are used as learning and objective functions. Coincidentally, EMI is essentially the same as the Semantic Mutual Information (SeMI) proposed by the author 30 years ago. This paper first reviews the evolutionary histories of semantic information measures and learning functions. It then briefly introduces the author’s semantic information G theory, with the rate-fidelity function R(G) (G denotes SeMI, and R(G) extends R(D)), and its applications to multi-label learning, maximum Mutual Information (MI) classification, and mixture models. Next, it discusses how we should understand the relationships between SeMI and Shannon’s MI, two generalized entropies (fuzzy entropy and coverage entropy), autoencoders, Gibbs distributions, and partition functions from the perspective of the R(G) function, or the G theory. An important conclusion is that mixture models and Restricted Boltzmann Machines converge because SeMI is maximized while Shannon’s MI is minimized, bringing the information efficiency G/R close to 1. A potential opportunity is to simplify deep learning by using Gaussian channel mixture models to pre-train the latent layers of deep neural networks without considering gradients. The paper also discusses how the SeMI measure can serve as the reward function (reflecting purposiveness) in reinforcement learning. The G theory helps interpret deep learning but is far from sufficient by itself; combining semantic information theory and deep learning will accelerate the development of both.
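The abstract’s central claim, that EMI is essentially the same as SeMI, can be made concrete with a short worked sketch. The LaTeX below uses notation assumed from the author’s G theory and from the InfoNCE literature rather than quoted from this record; treat it as an orientation aid, not the article’s own derivation.

% A minimal sketch under assumed notation. SeMI replaces the Shannon
% channel P(y|x) with a truth/similarity function T(\theta_j | x); its
% "logical probability" normalizer is
% T(\theta_j) = \sum_i P(x_i) T(\theta_j | x_i).
\[
G \;=\; I(X;\Theta) \;=\; \sum_j \sum_i P(x_i, y_j)\,
    \log \frac{T(\theta_j \mid x_i)}{T(\theta_j)}
\]
% InfoNCE likewise normalizes a learned similarity score f(x, y) over
% N alternatives, and its loss lower-bounds Shannon's MI:
\[
\mathcal{L}_{\mathrm{NCE}} \;=\;
  -\,\mathbb{E}\!\left[\log
    \frac{e^{f(x,y)}}{\sum_{j=1}^{N} e^{f(x, y_j)}}\right],
\qquad
I(X;Y) \;\ge\; \log N - \mathcal{L}_{\mathrm{NCE}}
\]
% Both measures score an (x, y) pair against normalized alternatives,
% which is the sense in which EMI and SeMI coincide. The information
% efficiency mentioned in the abstract is the ratio G/R, with
% R = I(X;Y) the Shannon MI; it approaches 1 at convergence.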
format Online
Article
Text
id pubmed-10217299
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10217299 2023-05-27 Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning Lu, Chenguang Entropy (Basel) Review A new trend in deep learning, represented by Mutual Information Neural Estimation (MINE) and Information Noise-Contrastive Estimation (InfoNCE), is emerging. In this trend, similarity functions and Estimated Mutual Information (EMI) are used as learning and objective functions. Coincidentally, EMI is essentially the same as the Semantic Mutual Information (SeMI) proposed by the author 30 years ago. This paper first reviews the evolutionary histories of semantic information measures and learning functions. It then briefly introduces the author’s semantic information G theory, with the rate-fidelity function R(G) (G denotes SeMI, and R(G) extends R(D)), and its applications to multi-label learning, maximum Mutual Information (MI) classification, and mixture models. Next, it discusses how we should understand the relationships between SeMI and Shannon’s MI, two generalized entropies (fuzzy entropy and coverage entropy), autoencoders, Gibbs distributions, and partition functions from the perspective of the R(G) function, or the G theory. An important conclusion is that mixture models and Restricted Boltzmann Machines converge because SeMI is maximized while Shannon’s MI is minimized, bringing the information efficiency G/R close to 1. A potential opportunity is to simplify deep learning by using Gaussian channel mixture models to pre-train the latent layers of deep neural networks without considering gradients. The paper also discusses how the SeMI measure can serve as the reward function (reflecting purposiveness) in reinforcement learning. The G theory helps interpret deep learning but is far from sufficient by itself; combining semantic information theory and deep learning will accelerate the development of both. MDPI 2023-05-15 /pmc/articles/PMC10217299/ /pubmed/37238557 http://dx.doi.org/10.3390/e25050802 Text en © 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Review
Lu, Chenguang
Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning
title Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning
title_full Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning
title_fullStr Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning
title_full_unstemmed Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning
title_short Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning
title_sort reviewing evolution of learning functions and semantic information measures for understanding deep learning
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10217299/
https://www.ncbi.nlm.nih.gov/pubmed/37238557
http://dx.doi.org/10.3390/e25050802
work_keys_str_mv AT luchenguang reviewingevolutionoflearningfunctionsandsemanticinformationmeasuresforunderstandingdeeplearning