Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation
Deep Reinforcement Learning (DRL) algorithms have been widely studied for sequential decision-making problems, and substantial progress has been achieved, especially in autonomous robotic skill learning. However, it is always difficult to deploy DRL methods in practical safety-critical robot systems...
Main Authors: | Xiang, Guofei; Dian, Songyi; Du, Shaofeng; Lv, Zhonghui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9864208/ https://www.ncbi.nlm.nih.gov/pubmed/36679561 http://dx.doi.org/10.3390/s23020762 |
_version_ | 1784875526357254144 |
---|---|
author | Xiang, Guofei Dian, Songyi Du, Shaofeng Lv, Zhonghui |
author_facet | Xiang, Guofei Dian, Songyi Du, Shaofeng Lv, Zhonghui |
author_sort | Xiang, Guofei |
collection | PubMed |
description | Deep Reinforcement Learning (DRL) algorithms have been widely studied for sequential decision-making problems, and substantial progress has been achieved, especially in autonomous robotic skill learning. However, it is difficult to deploy DRL methods in practical safety-critical robot systems, since a gap between the training and deployment environments always exists, and this issue becomes increasingly crucial in an ever-changing environment. Aiming at efficient robotic skill transfer in a dynamic environment, we present a meta-reinforcement learning algorithm based on a variational information bottleneck. More specifically, during the meta-training stage, the variational information bottleneck is first applied to infer the complete set of basic tasks for the whole task space; the maximum entropy regularized reinforcement learning framework is then used to learn the basic skills corresponding to those basic tasks. Once the training stage is completed, all of the tasks in the task space can be obtained by a nonlinear combination of the basic tasks, and thus the skills needed to accomplish those tasks can also be obtained by combining the basic skills. Empirical results on several highly nonlinear, high-dimensional robotic locomotion tasks show that the proposed variational information bottleneck regularized deep reinforcement learning algorithm can improve sample efficiency by 200–5000 times on new tasks. Furthermore, the proposed algorithm achieves substantial asymptotic performance improvement. The results indicate that the proposed meta-reinforcement learning framework makes a significant step toward deploying DRL-based algorithms on practical robot systems. |
format | Online Article Text |
id | pubmed-9864208 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9864208 2023-01-22 Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation Xiang, Guofei Dian, Songyi Du, Shaofeng Lv, Zhonghui Sensors (Basel) Article Deep Reinforcement Learning (DRL) algorithms have been widely studied for sequential decision-making problems, and substantial progress has been achieved, especially in autonomous robotic skill learning. However, it is difficult to deploy DRL methods in practical safety-critical robot systems, since a gap between the training and deployment environments always exists, and this issue becomes increasingly crucial in an ever-changing environment. Aiming at efficient robotic skill transfer in a dynamic environment, we present a meta-reinforcement learning algorithm based on a variational information bottleneck. More specifically, during the meta-training stage, the variational information bottleneck is first applied to infer the complete set of basic tasks for the whole task space; the maximum entropy regularized reinforcement learning framework is then used to learn the basic skills corresponding to those basic tasks. Once the training stage is completed, all of the tasks in the task space can be obtained by a nonlinear combination of the basic tasks, and thus the skills needed to accomplish those tasks can also be obtained by combining the basic skills. Empirical results on several highly nonlinear, high-dimensional robotic locomotion tasks show that the proposed variational information bottleneck regularized deep reinforcement learning algorithm can improve sample efficiency by 200–5000 times on new tasks. Furthermore, the proposed algorithm achieves substantial asymptotic performance improvement. The results indicate that the proposed meta-reinforcement learning framework makes a significant step toward deploying DRL-based algorithms on practical robot systems. MDPI 2023-01-09 /pmc/articles/PMC9864208/ /pubmed/36679561 http://dx.doi.org/10.3390/s23020762 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Xiang, Guofei Dian, Songyi Du, Shaofeng Lv, Zhonghui Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation |
title | Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation |
title_full | Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation |
title_fullStr | Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation |
title_full_unstemmed | Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation |
title_short | Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation |
title_sort | variational information bottleneck regularized deep reinforcement learning for efficient robotic skill adaptation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9864208/ https://www.ncbi.nlm.nih.gov/pubmed/36679561 http://dx.doi.org/10.3390/s23020762 |
work_keys_str_mv | AT xiangguofei variationalinformationbottleneckregularizeddeepreinforcementlearningforefficientroboticskilladaptation AT diansongyi variationalinformationbottleneckregularizeddeepreinforcementlearningforefficientroboticskilladaptation AT dushaofeng variationalinformationbottleneckregularizeddeepreinforcementlearningforefficientroboticskilladaptation AT lvzhonghui variationalinformationbottleneckregularizeddeepreinforcementlearningforefficientroboticskilladaptation |
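
The description field above mentions regularizing learning with a variational information bottleneck. As a rough illustration only, the sketch below shows the generic form such a regularizer commonly takes: a KL divergence between a diagonal-Gaussian latent posterior and a standard normal prior, scaled by a weight and added to a task loss. The record does not give the authors' exact objective, so the function names, the `beta` weight, and the choice of a standard normal prior are all assumptions made for this example.

```python
import math


def gaussian_kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ), in nats.

    `mean` and `log_var` are equal-length sequences of floats describing a
    diagonal Gaussian over a latent variable (hypothetical task embedding).
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv
        for m, lv in zip(mean, log_var)
    )


def vib_regularized_loss(task_loss, mean, log_var, beta=0.1):
    """Task loss plus a beta-weighted bottleneck (KL) penalty.

    `beta` is an assumed hyperparameter; larger values compress the latent
    more strongly toward the prior.
    """
    return task_loss + beta * gaussian_kl_to_standard_normal(mean, log_var)


if __name__ == "__main__":
    # A posterior equal to the prior incurs no penalty.
    print(gaussian_kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
    # Shifting the mean away from zero adds a penalty to the task loss.
    print(vib_regularized_loss(1.0, [1.0], [0.0], beta=0.1))
```

The penalty is zero only when the posterior matches the prior, so the weight trades off how much information the latent keeps against how well the task loss is minimized.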