
Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration

Reinforcement learning (RL) empowers the agent to learn robotic manipulation skills autonomously. Compared with traditional single-goal RL, semantic-goal-conditioned RL expands the agent's capacity to accomplish multiple semantic manipulation instructions. However, due to sparsely distributed semantic...


Bibliographic Details
Main Authors: Han, Changlin, Peng, Zhiyong, Liu, Yadong, Tang, Jingsheng, Yu, Yang, Zhou, Zongtan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028088/
https://www.ncbi.nlm.nih.gov/pubmed/36960195
http://dx.doi.org/10.3389/fnbot.2023.1089270
_version_ 1784909863958085632
author Han, Changlin
Peng, Zhiyong
Liu, Yadong
Tang, Jingsheng
Yu, Yang
Zhou, Zongtan
author_facet Han, Changlin
Peng, Zhiyong
Liu, Yadong
Tang, Jingsheng
Yu, Yang
Zhou, Zongtan
author_sort Han, Changlin
collection PubMed
description Reinforcement learning (RL) empowers the agent to learn robotic manipulation skills autonomously. Compared with traditional single-goal RL, semantic-goal-conditioned RL expands the agent's capacity to accomplish multiple semantic manipulation instructions. However, due to sparsely distributed semantic goals and sparse-reward agent-environment interactions, a hard exploration problem arises and impedes the agent's training process. In traditional RL, curiosity-motivated exploration has proven effective at solving the hard exploration problem. In semantic-goal-conditioned RL, however, the performance of previous curiosity-motivated methods deteriorates, which we attribute to two defects: uncontrollability and distraction. To address these defects, we propose a conservative curiosity-motivated method named mutual information motivation with hybrid policy mechanism (MIHM). MIHM contributes two main innovations: a decoupled-mutual-information-based intrinsic motivation, which prevents uncontrollable curiosity from driving the agent toward dangerous states, and a precisely trained, automatically switched hybrid policy mechanism, which eliminates distraction from the curiosity-motivated policy and achieves an optimal balance between exploration and exploitation. Compared with four state-of-the-art curiosity-motivated methods on a sparse-reward robotic manipulation task with 35 valid semantic goals, including stacks of two or three objects and pyramids, our MIHM shows the fastest learning speed. Moreover, MIHM achieves the highest total success rate, 0.9, whereas the other methods reach at most 0.6. Of all the evaluated methods, our MIHM is the only one that succeeds in stacking three objects.
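To make the general idea of curiosity-motivated exploration under sparse rewards concrete, the following is a minimal, hypothetical Python sketch of reward shaping with an intrinsic curiosity bonus. It uses a simple forward-model prediction error as the curiosity signal; the function names and constants are illustrative assumptions, and this is not the decoupled-mutual-information motivation or hybrid policy mechanism described by the authors.

import numpy as np

# Hypothetical sketch (not the authors' MIHM implementation): a generic
# curiosity-style intrinsic reward added to a sparse extrinsic reward.
# The forward-model prediction error stands in for the curiosity signal.

def intrinsic_bonus(predicted_next_state, true_next_state, scale=0.01):
    """Curiosity bonus: squared prediction error of a learned forward model."""
    diff = np.asarray(predicted_next_state) - np.asarray(true_next_state)
    return scale * float(np.sum(diff ** 2))

def shaped_reward(extrinsic_reward, predicted_next_state, true_next_state, beta=1.0):
    """Total reward = sparse task reward + weighted curiosity bonus."""
    return extrinsic_reward + beta * intrinsic_bonus(predicted_next_state, true_next_state)

# Example: the sparse task reward is 0 (goal not reached), but a surprising
# transition still produces a small positive exploration signal.
predicted = np.zeros(3)
observed = np.array([0.1, -0.2, 0.05])
print(shaped_reward(0.0, predicted, observed))  # ~0.000525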
format Online
Article
Text
id pubmed-10028088
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10028088 2023-03-22 Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration Han, Changlin Peng, Zhiyong Liu, Yadong Tang, Jingsheng Yu, Yang Zhou, Zongtan Front Neurorobot Neuroscience Reinforcement learning (RL) empowers the agent to learn robotic manipulation skills autonomously. Compared with traditional single-goal RL, semantic-goal-conditioned RL expands the agent's capacity to accomplish multiple semantic manipulation instructions. However, due to sparsely distributed semantic goals and sparse-reward agent-environment interactions, a hard exploration problem arises and impedes the agent's training process. In traditional RL, curiosity-motivated exploration has proven effective at solving the hard exploration problem. In semantic-goal-conditioned RL, however, the performance of previous curiosity-motivated methods deteriorates, which we attribute to two defects: uncontrollability and distraction. To address these defects, we propose a conservative curiosity-motivated method named mutual information motivation with hybrid policy mechanism (MIHM). MIHM contributes two main innovations: a decoupled-mutual-information-based intrinsic motivation, which prevents uncontrollable curiosity from driving the agent toward dangerous states, and a precisely trained, automatically switched hybrid policy mechanism, which eliminates distraction from the curiosity-motivated policy and achieves an optimal balance between exploration and exploitation. Compared with four state-of-the-art curiosity-motivated methods on a sparse-reward robotic manipulation task with 35 valid semantic goals, including stacks of two or three objects and pyramids, our MIHM shows the fastest learning speed. Moreover, MIHM achieves the highest total success rate, 0.9, whereas the other methods reach at most 0.6. Of all the evaluated methods, our MIHM is the only one that succeeds in stacking three objects. Frontiers Media S.A. 2023-03-07 /pmc/articles/PMC10028088/ /pubmed/36960195 http://dx.doi.org/10.3389/fnbot.2023.1089270 Text en Copyright © 2023 Han, Peng, Liu, Tang, Yu and Zhou. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Neuroscience
Han, Changlin
Peng, Zhiyong
Liu, Yadong
Tang, Jingsheng
Yu, Yang
Zhou, Zongtan
Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration
title Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration
title_full Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration
title_fullStr Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration
title_full_unstemmed Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration
title_short Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration
title_sort learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028088/
https://www.ncbi.nlm.nih.gov/pubmed/36960195
http://dx.doi.org/10.3389/fnbot.2023.1089270
work_keys_str_mv AT hanchanglin learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration
AT pengzhiyong learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration
AT liuyadong learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration
AT tangjingsheng learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration
AT yuyang learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration
AT zhouzongtan learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration