Detecting Changes and Avoiding Catastrophic Forgetting in Dynamic Partially Observable Environments
The ability of an agent to detect changes in an environment is key to successful adaptation. This ability involves at least two phases: learning a model of an environment, and detecting that a change is likely to have occurred when this model is no longer accurate. This task is particularly challeng...
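Main authors: | Dick, Jeffery; Ladosz, Pawel; Ben-Iwhiwhu, Eseoghene; Shimadzu, Hideyasu; Kinnell, Peter; Pilly, Praveen K.; Kolouri, Soheil; Soltoggio, Andrea |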
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787001/ https://www.ncbi.nlm.nih.gov/pubmed/33424575 http://dx.doi.org/10.3389/fnbot.2020.578675 |
_version_ | 1783632742650478592 |
---|---|
author | Dick, Jeffery Ladosz, Pawel Ben-Iwhiwhu, Eseoghene Shimadzu, Hideyasu Kinnell, Peter Pilly, Praveen K. Kolouri, Soheil Soltoggio, Andrea |
author_facet | Dick, Jeffery Ladosz, Pawel Ben-Iwhiwhu, Eseoghene Shimadzu, Hideyasu Kinnell, Peter Pilly, Praveen K. Kolouri, Soheil Soltoggio, Andrea |
author_sort | Dick, Jeffery |
collection | PubMed |
description | The ability of an agent to detect changes in an environment is key to successful adaptation. This ability involves at least two phases: learning a model of an environment, and detecting that a change is likely to have occurred when this model is no longer accurate. This task is particularly challenging in partially observable environments, such as those modeled with partially observable Markov decision processes (POMDPs). Some predictive learners are able to infer the state from observations and thus perform better with partial observability. Predictive state representations (PSRs) and neural networks are two such tools that can be trained to predict the probabilities of future observations. However, most such existing methods focus primarily on static problems in which only one environment is learned. In this paper, we propose an algorithm that uses statistical tests to estimate the probability of different predictive models to fit the current environment. We exploit the underlying probability distributions of predictive models to provide a fast and explainable method to assess and justify the model's beliefs about the current environment. Crucially, by doing so, the method can label incoming data as fitting different models, and thus can continuously train separate models in different environments. This new method is shown to prevent catastrophic forgetting when new environments, or tasks, are encountered. The method can also be of use when AI-informed decisions require justifications because its beliefs are based on statistical evidence from observations. We empirically demonstrate the benefit of the novel method with simulations in a set of POMDP environments. |
format | Online Article Text |
id | pubmed-7787001 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7787001 2021-01-07 Detecting Changes and Avoiding Catastrophic Forgetting in Dynamic Partially Observable Environments Dick, Jeffery Ladosz, Pawel Ben-Iwhiwhu, Eseoghene Shimadzu, Hideyasu Kinnell, Peter Pilly, Praveen K. Kolouri, Soheil Soltoggio, Andrea Front Neurorobot Neuroscience The ability of an agent to detect changes in an environment is key to successful adaptation. This ability involves at least two phases: learning a model of an environment, and detecting that a change is likely to have occurred when this model is no longer accurate. This task is particularly challenging in partially observable environments, such as those modeled with partially observable Markov decision processes (POMDPs). Some predictive learners are able to infer the state from observations and thus perform better with partial observability. Predictive state representations (PSRs) and neural networks are two such tools that can be trained to predict the probabilities of future observations. However, most such existing methods focus primarily on static problems in which only one environment is learned. In this paper, we propose an algorithm that uses statistical tests to estimate the probability of different predictive models to fit the current environment. We exploit the underlying probability distributions of predictive models to provide a fast and explainable method to assess and justify the model's beliefs about the current environment. Crucially, by doing so, the method can label incoming data as fitting different models, and thus can continuously train separate models in different environments. This new method is shown to prevent catastrophic forgetting when new environments, or tasks, are encountered. The method can also be of use when AI-informed decisions require justifications because its beliefs are based on statistical evidence from observations. We empirically demonstrate the benefit of the novel method with simulations in a set of POMDP environments. Frontiers Media S.A. 2020-12-23 /pmc/articles/PMC7787001/ /pubmed/33424575 http://dx.doi.org/10.3389/fnbot.2020.578675 Text en Copyright © 2020 Dick, Ladosz, Ben-Iwhiwhu, Shimadzu, Kinnell, Pilly, Kolouri and Soltoggio. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Dick, Jeffery Ladosz, Pawel Ben-Iwhiwhu, Eseoghene Shimadzu, Hideyasu Kinnell, Peter Pilly, Praveen K. Kolouri, Soheil Soltoggio, Andrea Detecting Changes and Avoiding Catastrophic Forgetting in Dynamic Partially Observable Environments |
title | Detecting Changes and Avoiding Catastrophic Forgetting in Dynamic Partially Observable Environments |
title_full | Detecting Changes and Avoiding Catastrophic Forgetting in Dynamic Partially Observable Environments |
title_fullStr | Detecting Changes and Avoiding Catastrophic Forgetting in Dynamic Partially Observable Environments |
title_full_unstemmed | Detecting Changes and Avoiding Catastrophic Forgetting in Dynamic Partially Observable Environments |
title_short | Detecting Changes and Avoiding Catastrophic Forgetting in Dynamic Partially Observable Environments |
title_sort | detecting changes and avoiding catastrophic forgetting in dynamic partially observable environments |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787001/ https://www.ncbi.nlm.nih.gov/pubmed/33424575 http://dx.doi.org/10.3389/fnbot.2020.578675 |
work_keys_str_mv | AT dickjeffery detectingchangesandavoidingcatastrophicforgettingindynamicpartiallyobservableenvironments AT ladoszpawel detectingchangesandavoidingcatastrophicforgettingindynamicpartiallyobservableenvironments AT beniwhiwhueseoghene detectingchangesandavoidingcatastrophicforgettingindynamicpartiallyobservableenvironments AT shimadzuhideyasu detectingchangesandavoidingcatastrophicforgettingindynamicpartiallyobservableenvironments AT kinnellpeter detectingchangesandavoidingcatastrophicforgettingindynamicpartiallyobservableenvironments AT pillypraveenk detectingchangesandavoidingcatastrophicforgettingindynamicpartiallyobservableenvironments AT kolourisoheil detectingchangesandavoidingcatastrophicforgettingindynamicpartiallyobservableenvironments AT soltoggioandrea detectingchangesandavoidingcatastrophicforgettingindynamicpartiallyobservableenvironments |