Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration
Main authors: | Han, Changlin; Peng, Zhiyong; Liu, Yadong; Tang, Jingsheng; Yu, Yang; Zhou, Zongtan |
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2023 |
Subjects: | Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028088/ https://www.ncbi.nlm.nih.gov/pubmed/36960195 http://dx.doi.org/10.3389/fnbot.2023.1089270 |
_version_ | 1784909863958085632 |
author | Han, Changlin; Peng, Zhiyong; Liu, Yadong; Tang, Jingsheng; Yu, Yang; Zhou, Zongtan |
author_facet | Han, Changlin; Peng, Zhiyong; Liu, Yadong; Tang, Jingsheng; Yu, Yang; Zhou, Zongtan |
author_sort | Han, Changlin |
collection | PubMed |
description | Reinforcement learning (RL) empowers an agent to learn robotic manipulation skills autonomously. Compared with traditional single-goal RL, semantic-goal-conditioned RL expands the agent's capacity to accomplish multiple semantic manipulation instructions. However, due to sparsely distributed semantic goals and sparse-reward agent-environment interactions, a hard exploration problem arises and impedes agent training. In traditional RL, curiosity-motivated exploration has proven effective at solving the hard exploration problem. In semantic-goal-conditioned RL, however, the performance of previous curiosity-motivated methods deteriorates, which we attribute to two defects: uncontrollability and distraction. To address these defects, we propose a conservative curiosity-motivated method named mutual information motivation with hybrid policy mechanism (MIHM). MIHM contributes two main innovations: a decoupled-mutual-information-based intrinsic motivation, which prevents the agent from being driven toward dangerous states by uncontrollable curiosity; and a precisely trained, automatically switched hybrid policy mechanism, which eliminates the distraction caused by the curiosity-motivated policy and achieves optimal use of exploration and exploitation. Compared with four state-of-the-art curiosity-motivated methods on a sparse-reward robotic manipulation task with 35 valid semantic goals, including stacks of 2 or 3 objects and pyramids, MIHM shows the fastest learning speed. Moreover, MIHM achieves the highest total success rate of 0.9, compared with at most 0.6 for the other methods. Among all baseline methods, MIHM is the only one that succeeds in stacking three objects. |
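The automatically switched hybrid policy mechanism described in the abstract can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the class name, the sliding-window success-rate statistic, and the fixed switching threshold are all assumptions introduced here for illustration; the paper's actual mechanism is "precisely trained" rather than threshold-based.

```python
class HybridPolicy:
    """Toy sketch of an automatically switched hybrid policy: act with a
    curiosity-driven exploration policy until the commanded semantic goal
    is reliably achieved, then switch to the exploitation policy."""

    def __init__(self, explore_policy, exploit_policy,
                 switch_threshold=0.5, window=20):
        self.explore_policy = explore_policy      # curiosity-motivated policy
        self.exploit_policy = exploit_policy      # goal-conditioned task policy
        self.switch_threshold = switch_threshold  # success rate that triggers exploitation
        self.window = window                      # sliding-window length (episodes)
        self.outcomes = []                        # recent episode outcomes

    def record_episode(self, success):
        """Store the outcome of the latest episode for the commanded goal."""
        self.outcomes.append(bool(success))
        if len(self.outcomes) > self.window:
            self.outcomes.pop(0)

    def success_rate(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def act(self, obs, goal):
        """Dispatch to the exploitation policy once the goal is reliably
        reachable; otherwise keep exploring."""
        if self.success_rate() >= self.switch_threshold:
            return self.exploit_policy(obs, goal)
        return self.explore_policy(obs, goal)
```

For example, with placeholder policies that return the strings "explore" and "exploit", the agent explores until half of the recent episodes succeed on the commanded goal, then hands control to the exploitation policy.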
format | Online Article Text |
id | pubmed-10028088 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-100280882023-03-22 Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration Han, Changlin Peng, Zhiyong Liu, Yadong Tang, Jingsheng Yu, Yang Zhou, Zongtan Front Neurorobot Neuroscience Reinforcement learning (RL) empowers an agent to learn robotic manipulation skills autonomously. Compared with traditional single-goal RL, semantic-goal-conditioned RL expands the agent's capacity to accomplish multiple semantic manipulation instructions. However, due to sparsely distributed semantic goals and sparse-reward agent-environment interactions, a hard exploration problem arises and impedes agent training. In traditional RL, curiosity-motivated exploration has proven effective at solving the hard exploration problem. In semantic-goal-conditioned RL, however, the performance of previous curiosity-motivated methods deteriorates, which we attribute to two defects: uncontrollability and distraction. To address these defects, we propose a conservative curiosity-motivated method named mutual information motivation with hybrid policy mechanism (MIHM). MIHM contributes two main innovations: a decoupled-mutual-information-based intrinsic motivation, which prevents the agent from being driven toward dangerous states by uncontrollable curiosity; and a precisely trained, automatically switched hybrid policy mechanism, which eliminates the distraction caused by the curiosity-motivated policy and achieves optimal use of exploration and exploitation. Compared with four state-of-the-art curiosity-motivated methods on a sparse-reward robotic manipulation task with 35 valid semantic goals, including stacks of 2 or 3 objects and pyramids, MIHM shows the fastest learning speed. Moreover, MIHM achieves the highest total success rate of 0.9, compared with at most 0.6 for the other methods. Among all baseline methods, MIHM is the only one that succeeds in stacking three objects.
Frontiers Media S.A. 2023-03-07 /pmc/articles/PMC10028088/ /pubmed/36960195 http://dx.doi.org/10.3389/fnbot.2023.1089270 Text en Copyright © 2023 Han, Peng, Liu, Tang, Yu and Zhou. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Han, Changlin Peng, Zhiyong Liu, Yadong Tang, Jingsheng Yu, Yang Zhou, Zongtan Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_full | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_fullStr | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_full_unstemmed | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_short | Learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
title_sort | learning robotic manipulation skills with multiple semantic goals by conservative curiosity-motivated exploration |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028088/ https://www.ncbi.nlm.nih.gov/pubmed/36960195 http://dx.doi.org/10.3389/fnbot.2023.1089270 |
work_keys_str_mv | AT hanchanglin learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT pengzhiyong learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT liuyadong learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT tangjingsheng learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT yuyang learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration AT zhouzongtan learningroboticmanipulationskillswithmultiplesemanticgoalsbyconservativecuriositymotivatedexploration |