
The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place

This paper proposes a task decomposition and dedicated reward-system-based reinforcement learning algorithm for the Pick-and-Place task, which is one of the high-level tasks of robot manipulators. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one...


Bibliographic Details
Main authors:	Kim, Byeongjun; Kwon, Gunam; Park, Chaneun; Kwon, Nam Kyu
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10296071/ https://www.ncbi.nlm.nih.gov/pubmed/37366835 http://dx.doi.org/10.3390/biomimetics8020240

_version_	1785063570971557888
author	Kim, Byeongjun Kwon, Gunam Park, Chaneun Kwon, Nam Kyu
author_facet	Kim, Byeongjun Kwon, Gunam Park, Chaneun Kwon, Nam Kyu
author_sort	Kim, Byeongjun
collection	PubMed
description	This paper proposes a task decomposition and dedicated reward-system-based reinforcement learning algorithm for the Pick-and-Place task, which is one of the high-level tasks of robot manipulators. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one grasping task. One of the two reaching tasks is approaching the object, and the other is reaching the place position. These two reaching tasks are carried out using each optimal policy of the agents which are trained using Soft Actor-Critic (SAC). Different from the two reaching tasks, the grasping is implemented via simple logic which is easily designable but may result in improper gripping. To assist the grasping task properly, a dedicated reward system for approaching the object is designed through using individual axis-based weights. To verify the validity of the proposed method, we carry out various experiments in the MuJoCo physics engine with the Robosuite framework. According to the simulation results of four trials, the robot manipulator picked up and released the object in the goal position with an average success rate of 93.2%.
format	Online Article Text
id	pubmed-10296071
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-10296071 2023-06-28 The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place Kim, Byeongjun Kwon, Gunam Park, Chaneun Kwon, Nam Kyu Biomimetics (Basel) Article This paper proposes a task decomposition and dedicated reward-system-based reinforcement learning algorithm for the Pick-and-Place task, which is one of the high-level tasks of robot manipulators. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one grasping task. One of the two reaching tasks is approaching the object, and the other is reaching the place position. These two reaching tasks are carried out using each optimal policy of the agents which are trained using Soft Actor-Critic (SAC). Different from the two reaching tasks, the grasping is implemented via simple logic which is easily designable but may result in improper gripping. To assist the grasping task properly, a dedicated reward system for approaching the object is designed through using individual axis-based weights. To verify the validity of the proposed method, we carry out various experiments in the MuJoCo physics engine with the Robosuite framework. According to the simulation results of four trials, the robot manipulator picked up and released the object in the goal position with an average success rate of 93.2%. MDPI 2023-06-06 /pmc/articles/PMC10296071/ /pubmed/37366835 http://dx.doi.org/10.3390/biomimetics8020240 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10296071/ https://www.ncbi.nlm.nih.gov/pubmed/37366835 http://dx.doi.org/10.3390/biomimetics8020240
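The description field above outlines the method only at a high level: two reaching subtasks, a logic-based grasp between them, and an approach reward built from individual axis-based weights. The following is a minimal sketch of that idea, not the authors' implementation. The weight values, the `tanh` shaping, the tolerance, the stage names, and every function name are hypothetical, since the record gives none of them. In the paper the two reaching stages are driven by SAC-trained policies in Robosuite; here they are left out and only the reward and the stage sequencing are shown.

```python
import math

# Hypothetical per-axis weights (x, y, z); the record does not state values.
AXIS_WEIGHTS = (1.0, 1.0, 2.0)


def weighted_distance(gripper_pos, target_pos, weights=AXIS_WEIGHTS):
    """Euclidean distance in which each axis error is scaled by its own weight."""
    return math.sqrt(
        sum(w * (g - t) ** 2 for w, g, t in zip(weights, gripper_pos, target_pos))
    )


def approach_reward(gripper_pos, object_pos, weights=AXIS_WEIGHTS):
    """Dense reward in (0, 1]; equals 1 when the gripper is exactly on the object.

    The 1 - tanh(d) shaping is an assumption made for this sketch.
    """
    return 1.0 - math.tanh(weighted_distance(gripper_pos, object_pos, weights))


def next_stage(stage, gripper_pos, object_pos, place_pos, tol=0.02):
    """Advance through the three subtasks: approach -> grasp -> place -> done."""
    if stage == "approach":
        # Reaching task 1: switch to grasping once the gripper is close enough.
        near = weighted_distance(gripper_pos, object_pos) < tol
        return "grasp" if near else "approach"
    if stage == "grasp":
        # Grasping is simple logic (close the gripper), so it always advances.
        return "place"
    if stage == "place":
        # Reaching task 2: unweighted distance to the place position.
        near = weighted_distance(gripper_pos, place_pos, (1.0, 1.0, 1.0)) < tol
        return "done" if near else "place"
    return "done"
```

Weighting one axis more heavily penalises misalignment along that axis more strongly during the approach, which is one plausible way such a reward could help a fixed grasp routine succeed; which axes the authors emphasise is not stated in this record.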


Similar Items