Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration
Main authors: | Han, Changlin; Peng, Zhiyong; Liu, Yadong; Tang, Jingsheng; Yu, Yang; Zhou, Zongtan |
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2023 |
Subjects: | Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028088/ https://www.ncbi.nlm.nih.gov/pubmed/36960195 http://dx.doi.org/10.3389/fnbot.2023.1089270 |
_version_ | 1784909863958085632 |
author | Han, Changlin; Peng, Zhiyong; Liu, Yadong; Tang, Jingsheng; Yu, Yang; Zhou, Zongtan |
author_facet | Han, Changlin; Peng, Zhiyong; Liu, Yadong; Tang, Jingsheng; Yu, Yang; Zhou, Zongtan |
author_sort | Han, Changlin |
collection | PubMed |
description | Reinforcement learning (RL) empowers an agent to learn robotic manipulation skills autonomously. Compared with traditional single-goal RL, semantic-goal-conditioned RL expands the agent's capacity to accomplish multiple semantic manipulation instructions. However, due to sparsely distributed semantic goals and sparse-reward agent-environment interactions, a hard exploration problem arises and impedes agent training. In traditional RL, curiosity-motivated exploration has proven effective at solving the hard exploration problem. In semantic-goal-conditioned RL, however, the performance of previous curiosity-motivated methods deteriorates, which we attribute to two defects: uncontrollability and distraction. To address these defects, we propose a conservative curiosity-motivated method named mutual information motivation with hybrid policy mechanism (MIHM). MIHM contributes two main innovations: a decoupled-mutual-information-based intrinsic motivation, which prevents the agent from being driven toward dangerous states by uncontrollable curiosity; and a precisely trained, automatically switched hybrid policy mechanism, which eliminates the distraction caused by the curiosity-motivated policy and achieves optimal use of exploration and exploitation. Compared with four state-of-the-art curiosity-motivated methods on a sparse-reward robotic manipulation task with 35 valid semantic goals, including stacks of 2 or 3 objects and pyramids, MIHM shows the fastest learning speed. Moreover, MIHM achieves the highest total success rate of 0.9, compared with at most 0.6 for the other methods. Among all baseline methods, MIHM is the only one that succeeds in stacking three objects. |
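The automatically switched hybrid policy mechanism described in the abstract can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the class name, the sliding-window success-rate statistic, and the fixed switching threshold are all assumptions introduced here for illustration; the paper's actual mechanism is "precisely trained" rather than threshold-based.

```python
class HybridPolicy:
    """Toy sketch of an automatically switched hybrid policy: act with a
    curiosity-driven exploration policy until the commanded semantic goal
    is reliably achieved, then switch to the exploitation policy."""

    def __init__(self, explore_policy, exploit_policy,
                 switch_threshold=0.5, window=20):
        self.explore_policy = explore_policy      # curiosity-motivated policy
        self.exploit_policy = exploit_policy      # goal-conditioned task policy
        self.switch_threshold = switch_threshold  # success rate that triggers exploitation
        self.window = window                      # sliding-window length (episodes)
        self.outcomes = []                        # recent episode outcomes

    def record_episode(self, success):
        """Store the outcome of the latest episode for the commanded goal."""
        self.outcomes.append(bool(success))
        if len(self.outcomes) > self.window:
            self.outcomes.pop(0)

    def success_rate(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def act(self, obs, goal):
        """Dispatch to the exploitation policy once the goal is reliably
        reachable; otherwise keep exploring."""
        if self.success_rate() >= self.switch_threshold:
            return self.exploit_policy(obs, goal)
        return self.explore_policy(obs, goal)
```

For example, with placeholder policies that return the strings "explore" and "exploit", the agent explores until half of the recent episodes succeed on the commanded goal, then hands control to the exploitation policy.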
format | Online Article Text |
id | pubmed-10028088 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-100280882023-03-22 Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration Han, Changlin Peng, Zhiyong Liu, Yadong Tang, Jingsheng Yu, Yang Zhou, Zongtan Front Neurorobot Neuroscience Reinforcement learning (RL) empowers an agent to learn robotic manipulation skills autonomously. Compared with traditional single-goal RL, semantic-goal-conditioned RL expands the agent's capacity to accomplish multiple semantic manipulation instructions. However, due to sparsely distributed semantic goals and sparse-reward agent-environment interactions, a hard exploration problem arises and impedes agent training. In traditional RL, curiosity-motivated exploration has proven effective at solving the hard exploration problem. In semantic-goal-conditioned RL, however, the performance of previous curiosity-motivated methods deteriorates, which we attribute to two defects: uncontrollability and distraction. To address these defects, we propose a conservative curiosity-motivated method named mutual information motivation with hybrid policy mechanism (MIHM). MIHM contributes two main innovations: a decoupled-mutual-information-based intrinsic motivation, which prevents the agent from being driven toward dangerous states by uncontrollable curiosity; and a precisely trained, automatically switched hybrid policy mechanism, which eliminates the distraction caused by the curiosity-motivated policy and achieves optimal use of exploration and exploitation. Compared with four state-of-the-art curiosity-motivated methods on a sparse-reward robotic manipulation task with 35 valid semantic goals, including stacks of 2 or 3 objects and pyramids, MIHM shows the fastest learning speed. Moreover, MIHM achieves the highest total success rate of 0.9, compared with at most 0.6 for the other methods. Among all baseline methods, MIHM is the only one that succeeds in stacking three objects.
Frontiers Media S.A. 2023-03-07 /pmc/articles/PMC10028088/ /pubmed/36960195 http://dx.doi.org/10.3389/fnbot.2023.1089270 Text en Copyright © 2023 Han, Peng, Liu, Tang, Yu and Zhou. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Han, Changlin Peng, Zhiyong Liu, Yadong Tang, Jingsheng Yu, Yang Zhou, Zongtan Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_full | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_fullStr | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_full_unstemmed | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_short | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_sort | learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028088/ https://www.ncbi.nlm.nih.gov/pubmed/36960195 http://dx.doi.org/10.3389/fnbot.2023.1089270 |
work_keys_str_mv | AT hanchanglin learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT pengzhiyong learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT liuyadong learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT tangjingsheng learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT yuyang learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT zhouzongtan learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration |