Generating Adaptive Behaviour within a Memory-Prediction Framework
The Memory-Prediction Framework (MPF) and its Hierarchical-Temporal Memory implementation (HTM) have been widely applied to unsupervised learning problems, for both classification and prediction. To date, there has been no attempt to incorporate MPF/HTM in reinforcement learning or other adaptive sy...
Main Authors: | Rawlinson, David; Kowadlo, Gideon |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2012 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3260147/ https://www.ncbi.nlm.nih.gov/pubmed/22272231 http://dx.doi.org/10.1371/journal.pone.0029264 |
_version_ | 1782221444856414208 |
---|---|
author | Rawlinson, David Kowadlo, Gideon |
author_facet | Rawlinson, David Kowadlo, Gideon |
author_sort | Rawlinson, David |
collection | PubMed |
description | The Memory-Prediction Framework (MPF) and its Hierarchical-Temporal Memory implementation (HTM) have been widely applied to unsupervised learning problems, for both classification and prediction. To date, there has been no attempt to incorporate MPF/HTM in reinforcement learning or other adaptive systems; that is, to use knowledge embodied within the hierarchy to control a system, or to generate behaviour for an agent. This problem is interesting because the human neocortex is believed to play a vital role in the generation of behaviour, and the MPF is a model of the human neocortex. We propose some simple and biologically-plausible enhancements to the Memory-Prediction Framework. These cause it to explore and interact with an external world, while trying to maximize a continuous, time-varying reward function. All behaviour is generated and controlled within the MPF hierarchy. The hierarchy develops from a random initial configuration by interaction with the world and reinforcement learning only. Among other demonstrations, we show that a 2-node hierarchy can learn to successfully play “rocks, paper, scissors” against a predictable opponent. |
format | Online Article Text |
id | pubmed-3260147 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3260147 2012-01-23 Generating Adaptive Behaviour within a Memory-Prediction Framework Rawlinson, David Kowadlo, Gideon PLoS One Research Article Public Library of Science 2012-01-17 /pmc/articles/PMC3260147/ /pubmed/22272231 http://dx.doi.org/10.1371/journal.pone.0029264 Text en Rawlinson, Kowadlo. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Rawlinson, David Kowadlo, Gideon Generating Adaptive Behaviour within a Memory-Prediction Framework |
title | Generating Adaptive Behaviour within a Memory-Prediction Framework |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3260147/ https://www.ncbi.nlm.nih.gov/pubmed/22272231 http://dx.doi.org/10.1371/journal.pone.0029264 |