Cargando…

An information theoretic score for learning hierarchical concepts

How do humans learn the regularities of their complex noisy world in a robust manner? There is ample evidence that much of this learning and development occurs in an unsupervised fashion via interactions with the environment. Both the structure of the world as well as the brain appear hierarchical i...

Descripción completa

Detalles Bibliográficos
Autor principal:	Madani, Omid
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	Frontiers Media S.A. 2023
Materias:	Neuroscience
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10185805/ https://www.ncbi.nlm.nih.gov/pubmed/37201121 http://dx.doi.org/10.3389/fncom.2023.1082502

_version_	1785042436658036736
author	Madani, Omid
author_facet	Madani, Omid
author_sort	Madani, Omid
collection	PubMed
description	How do humans learn the regularities of their complex noisy world in a robust manner? There is ample evidence that much of this learning and development occurs in an unsupervised fashion via interactions with the environment. Both the structure of the world as well as the brain appear hierarchical in a number of ways, and structured hierarchical representations offer potential benefits for efficient learning and organization of knowledge, such as concepts (patterns) sharing parts (subpatterns), and for providing a foundation for symbolic computation and language. A major question arises: what drives the processes behind acquiring such hierarchical spatiotemporal concepts? We posit that the goal of advancing one's predictions is a major driver for learning such hierarchies and introduce an information-theoretic score that shows promise in guiding the processes, and, in particular, motivating the learner to build larger concepts. We have been exploring the challenges of building an integrated learning and developing system within the framework of prediction games, wherein concepts serve as (1) predictors, (2) targets of prediction, and (3) building blocks for future higher-level concepts. Our current implementation works on raw text: it begins at a low level, such as characters, which are the hardwired or primitive concepts, and grows its vocabulary of networked hierarchical concepts over time. Concepts are strings or n-grams in our current realization, but we hope to relax this limitation, e.g., to a larger subclass of finite automata. After an overview of the current system, we focus on the score, named CORE. CORE is based on comparing the prediction performance of the system with a simple baseline system that is limited to predicting with the primitives. CORE incorporates a tradeoff between how strongly a concept is predicted (or how well it fits its context, i.e., nearby predicted concepts) vs. how well it matches the (ground) “reality,” i.e., the lowest level observations (the characters in the input episode). CORE is applicable to generative models such as probabilistic finite state machines (beyond strings). We highlight a few properties of CORE with examples. The learning is scalable and open-ended. For instance, thousands of concepts are learned after hundreds of thousands of episodes. We give examples of what is learned, and we also empirically compare with transformer neural networks and n-gram language models to situate the current implementation with respect to state-of-the-art and to further illustrate the similarities and differences with existing techniques. We touch on a variety of challenges and promising future directions in advancing the approach, in particular, the challenge of learning concepts with a more sophisticated structure.
format	Online Article Text
id	pubmed-10185805
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-101858052023-05-17 An information theoretic score for learning hierarchical concepts Madani, Omid Front Comput Neurosci Neuroscience How do humans learn the regularities of their complex noisy world in a robust manner? There is ample evidence that much of this learning and development occurs in an unsupervised fashion via interactions with the environment. Both the structure of the world as well as the brain appear hierarchical in a number of ways, and structured hierarchical representations offer potential benefits for efficient learning and organization of knowledge, such as concepts (patterns) sharing parts (subpatterns), and for providing a foundation for symbolic computation and language. A major question arises: what drives the processes behind acquiring such hierarchical spatiotemporal concepts? We posit that the goal of advancing one's predictions is a major driver for learning such hierarchies and introduce an information-theoretic score that shows promise in guiding the processes, and, in particular, motivating the learner to build larger concepts. We have been exploring the challenges of building an integrated learning and developing system within the framework of prediction games, wherein concepts serve as (1) predictors, (2) targets of prediction, and (3) building blocks for future higher-level concepts. Our current implementation works on raw text: it begins at a low level, such as characters, which are the hardwired or primitive concepts, and grows its vocabulary of networked hierarchical concepts over time. Concepts are strings or n-grams in our current realization, but we hope to relax this limitation, e.g., to a larger subclass of finite automata. After an overview of the current system, we focus on the score, named CORE. CORE is based on comparing the prediction performance of the system with a simple baseline system that is limited to predicting with the primitives. CORE incorporates a tradeoff between how strongly a concept is predicted (or how well it fits its context, i.e., nearby predicted concepts) vs. how well it matches the (ground) “reality,” i.e., the lowest level observations (the characters in the input episode). CORE is applicable to generative models such as probabilistic finite state machines (beyond strings). We highlight a few properties of CORE with examples. The learning is scalable and open-ended. For instance, thousands of concepts are learned after hundreds of thousands of episodes. We give examples of what is learned, and we also empirically compare with transformer neural networks and n-gram language models to situate the current implementation with respect to state-of-the-art and to further illustrate the similarities and differences with existing techniques. We touch on a variety of challenges and promising future directions in advancing the approach, in particular, the challenge of learning concepts with a more sophisticated structure. Frontiers Media S.A. 2023-05-02 /pmc/articles/PMC10185805/ /pubmed/37201121 http://dx.doi.org/10.3389/fncom.2023.1082502 Text en Copyright © 2023 Madani. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Neuroscience Madani, Omid An information theoretic score for learning hierarchical concepts
title	An information theoretic score for learning hierarchical concepts
title_full	An information theoretic score for learning hierarchical concepts
title_fullStr	An information theoretic score for learning hierarchical concepts
title_full_unstemmed	An information theoretic score for learning hierarchical concepts
title_short	An information theoretic score for learning hierarchical concepts
title_sort	information theoretic score for learning hierarchical concepts
topic	Neuroscience
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10185805/ https://www.ncbi.nlm.nih.gov/pubmed/37201121 http://dx.doi.org/10.3389/fncom.2023.1082502
work_keys_str_mv	AT madaniomid aninformationtheoreticscoreforlearninghierarchicalconcepts AT madaniomid informationtheoreticscoreforlearninghierarchicalconcepts

An information theoretic score for learning hierarchical concepts

Ejemplares similares