Examining the Use of Temporal-Difference Incremental Delta-Bar-Delta for Real-World Predictive Knowledge Architectures

Predictions and predictive knowledge have seen recent success in improving not only robot control but also other applications ranging from industrial process control to rehabilitation. A property that makes these predictive approaches well-suited for robotics is that they can be learned online and i...

Bibliographic Details
Main authors:	Günther, Johannes, Ady, Nadia M., Kearney, Alex, Dawson, Michael R., Pilarski, Patrick M.
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2020
Subjects:	Robotics and AI
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805647/ https://www.ncbi.nlm.nih.gov/pubmed/33501202 http://dx.doi.org/10.3389/frobt.2020.00034

author	Günther, Johannes Ady, Nadia M. Kearney, Alex Dawson, Michael R. Pilarski, Patrick M.
collection	PubMed
description	Predictions and predictive knowledge have seen recent success in improving not only robot control but also other applications ranging from industrial process control to rehabilitation. A property that makes these predictive approaches well-suited for robotics is that they can be learned online and incrementally through interaction with the environment. However, a remaining challenge for many prediction-learning approaches is an appropriate choice of prediction-learning parameters, especially parameters that control the magnitude of a learning machine's updates to its predictions (the learning rates or step sizes). Typically, these parameters are chosen based on an extensive parameter search, an approach that neither scales well nor is well-suited for tasks that require changing step sizes due to non-stationarity. To begin to address this challenge, we examine the use of online step-size adaptation using the Modular Prosthetic Limb: a sensor-rich robotic arm intended for use by persons with amputations. Our method of choice, Temporal-Difference Incremental Delta-Bar-Delta (TIDBD), learns and adapts step sizes on a feature level; importantly, TIDBD allows step-size tuning and representation learning to occur at the same time. As a first contribution, we show that TIDBD is a practical alternative to classic Temporal-Difference (TD) learning tuned via an extensive parameter search. Both approaches perform comparably in terms of predicting future aspects of a robotic data stream, but TD only achieves comparable performance with a carefully hand-tuned learning rate, while TIDBD uses a robust meta-parameter and tunes its own learning rates. Secondly, our results show that for this particular application TIDBD allows the system to automatically detect patterns characteristic of sensor failures common to a number of robotic applications. As a third contribution, we investigate the sensitivity of classic TD and TIDBD with respect to the initial step-size values on our robotic data set, reaffirming the robustness of TIDBD as shown in previous papers. Together, these results promise to improve the ability of robotic devices to learn from interactions with their environments in a robust way, providing key capabilities for autonomous agents and robots.
format	Online Article Text
id	pubmed-7805647
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-7805647 2021-01-25 Front Robot AI Frontiers Media S.A. 2020-03-13 /pmc/articles/PMC7805647/ /pubmed/33501202 http://dx.doi.org/10.3389/frobt.2020.00034 Text en Copyright © 2020 Günther, Ady, Kearney, Dawson and Pilarski. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Examining the Use of Temporal-Difference Incremental Delta-Bar-Delta for Real-World Predictive Knowledge Architectures
topic	Robotics and AI
