
Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation

Deep Reinforcement Learning (DRL) algorithms have been widely studied for sequential decision-making problems, and substantial progress has been achieved, especially in autonomous robotic skill learning. However, it is always difficult to deploy DRL methods in practical safety-critical robot systems...


Bibliographic Details
Main authors:	Xiang, Guofei; Dian, Songyi; Du, Shaofeng; Lv, Zhonghui
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9864208/ https://www.ncbi.nlm.nih.gov/pubmed/36679561 http://dx.doi.org/10.3390/s23020762

_version_	1784875526357254144
author	Xiang, Guofei; Dian, Songyi; Du, Shaofeng; Lv, Zhonghui
author_sort	Xiang, Guofei
collection	PubMed
description	Deep reinforcement learning (DRL) algorithms have been widely studied for sequential decision-making problems, and substantial progress has been achieved, especially in autonomous robotic skill learning. However, DRL methods remain difficult to deploy in practical safety-critical robot systems, because a gap between the training and deployment environments always exists, and this issue becomes increasingly critical as the environment changes. Aiming at efficient robotic skill transfer in dynamic environments, we present a meta-reinforcement learning algorithm based on a variational information bottleneck. More specifically, during the meta-training stage, the variational information bottleneck is first applied to infer a complete set of basic tasks for the whole task space; the maximum entropy regularized reinforcement learning framework is then used to learn the basic skills corresponding to those basic tasks. Once training is complete, every task in the task space can be obtained as a nonlinear combination of the basic tasks, and thus the corresponding skills for accomplishing those tasks can also be obtained by some combination of the basic skills. Empirical results on several highly nonlinear, high-dimensional robotic locomotion tasks show that the proposed variational information bottleneck regularized deep reinforcement learning algorithm can improve sample efficiency by 200–5000 times on new tasks. Furthermore, the proposed algorithm achieves substantial asymptotic performance improvement. The results indicate that the proposed meta-reinforcement learning framework is a significant step toward deploying DRL-based algorithms in practical robot systems.
format	Online Article Text
id	pubmed-9864208
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9864208 2023-01-22 Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation Xiang, Guofei; Dian, Songyi; Du, Shaofeng; Lv, Zhonghui Sensors (Basel) Article MDPI 2023-01-09 /pmc/articles/PMC9864208/ /pubmed/36679561 http://dx.doi.org/10.3390/s23020762 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation
title_sort	variational information bottleneck regularized deep reinforcement learning for efficient robotic skill adaptation
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9864208/ https://www.ncbi.nlm.nih.gov/pubmed/36679561 http://dx.doi.org/10.3390/s23020762


Similar Items