
Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation

Deep Reinforcement Learning (DRL) algorithms have been widely studied for sequential decision-making problems, and substantial progress has been achieved, especially in autonomous robotic skill learning. However, it is always difficult to deploy DRL methods in practical safety-critical robot systems...


Bibliographic Details
Main Authors: Xiang, Guofei; Dian, Songyi; Du, Shaofeng; Lv, Zhonghui
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9864208/
https://www.ncbi.nlm.nih.gov/pubmed/36679561
http://dx.doi.org/10.3390/s23020762
_version_ 1784875526357254144
author Xiang, Guofei
Dian, Songyi
Du, Shaofeng
Lv, Zhonghui
author_facet Xiang, Guofei
Dian, Songyi
Du, Shaofeng
Lv, Zhonghui
author_sort Xiang, Guofei
collection PubMed
description Deep Reinforcement Learning (DRL) algorithms have been widely studied for sequential decision-making problems, and substantial progress has been achieved, especially in autonomous robotic skill learning. However, DRL methods remain difficult to deploy in practical safety-critical robot systems, because a gap between the training and deployment environments always exists, and this issue becomes increasingly crucial as the environment keeps changing. Aiming at efficient robotic skill transfer in a dynamic environment, we present a meta-reinforcement learning algorithm based on a variational information bottleneck. More specifically, during the meta-training stage, the variational information bottleneck is first applied to infer a complete set of basic tasks for the whole task space; then the maximum entropy regularized reinforcement learning framework is used to learn the basic skills corresponding to those basic tasks. Once the training stage is completed, every task in the task space can be obtained as a nonlinear combination of the basic tasks; thus, the skills needed to accomplish these tasks can likewise be obtained by combining the basic skills. Empirical results on several highly nonlinear, high-dimensional robotic locomotion tasks show that the proposed variational information bottleneck regularized deep reinforcement learning algorithm can improve sample efficiency by 200–5000 times on new tasks. Furthermore, the proposed algorithm achieves substantial asymptotic performance improvement. The results indicate that the proposed meta-reinforcement learning framework is a significant step toward deploying DRL-based algorithms in practical robot systems.
format Online
Article
Text
id pubmed-9864208
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9864208 2023-01-22 Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation Xiang, Guofei Dian, Songyi Du, Shaofeng Lv, Zhonghui Sensors (Basel) Article Deep Reinforcement Learning (DRL) algorithms have been widely studied for sequential decision-making problems, and substantial progress has been achieved, especially in autonomous robotic skill learning. However, DRL methods remain difficult to deploy in practical safety-critical robot systems, because a gap between the training and deployment environments always exists, and this issue becomes increasingly crucial as the environment keeps changing. Aiming at efficient robotic skill transfer in a dynamic environment, we present a meta-reinforcement learning algorithm based on a variational information bottleneck. More specifically, during the meta-training stage, the variational information bottleneck is first applied to infer a complete set of basic tasks for the whole task space; then the maximum entropy regularized reinforcement learning framework is used to learn the basic skills corresponding to those basic tasks. Once the training stage is completed, every task in the task space can be obtained as a nonlinear combination of the basic tasks; thus, the skills needed to accomplish these tasks can likewise be obtained by combining the basic skills. Empirical results on several highly nonlinear, high-dimensional robotic locomotion tasks show that the proposed variational information bottleneck regularized deep reinforcement learning algorithm can improve sample efficiency by 200–5000 times on new tasks. Furthermore, the proposed algorithm achieves substantial asymptotic performance improvement. The results indicate that the proposed meta-reinforcement learning framework is a significant step toward deploying DRL-based algorithms in practical robot systems.
MDPI 2023-01-09 /pmc/articles/PMC9864208/ /pubmed/36679561 http://dx.doi.org/10.3390/s23020762 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Xiang, Guofei
Dian, Songyi
Du, Shaofeng
Lv, Zhonghui
Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation
title Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation
title_full Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation
title_fullStr Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation
title_full_unstemmed Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation
title_short Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation
title_sort variational information bottleneck regularized deep reinforcement learning for efficient robotic skill adaptation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9864208/
https://www.ncbi.nlm.nih.gov/pubmed/36679561
http://dx.doi.org/10.3390/s23020762
work_keys_str_mv AT xiangguofei variationalinformationbottleneckregularizeddeepreinforcementlearningforefficientroboticskilladaptation
AT diansongyi variationalinformationbottleneckregularizeddeepreinforcementlearningforefficientroboticskilladaptation
AT dushaofeng variationalinformationbottleneckregularizeddeepreinforcementlearningforefficientroboticskilladaptation
AT lvzhonghui variationalinformationbottleneckregularizeddeepreinforcementlearningforefficientroboticskilladaptation