Prehensile and Non-Prehensile Robotic Pick-and-Place of Objects in Clutter Using Deep Reinforcement Learning

In this study, we develop a framework for an intelligent and self-supervised industrial pick-and-place operation for cluttered environments. Our goal is to have the agent learn to perform prehensile and non-prehensile robotic manipulations to improve the efficiency and throughput of the pick-and-place task. To achieve this, we formulate the problem as a Markov decision process (MDP) and deploy a model-free, temporal-difference deep reinforcement learning (RL) algorithm, the deep Q-network (DQN). Our MDP has three actions: ‘grasping’ from the prehensile manipulation category, and ‘left-slide’ and ‘right-slide’ from the non-prehensile manipulation category. Our DQN is composed of three fully convolutional networks (FCNs) based on the memory-efficient DenseNet-121 architecture, trained jointly without creating bottlenecks. Each FCN corresponds to one discrete action and outputs a pixel-wise map of affordances for that action. A reward is allocated after every forward pass, and backpropagation tunes the weights of the FCN corresponding to the executed action. In this manner, the agent learns non-prehensile manipulations that can enable subsequent successful prehensile manipulations, and vice versa, increasing the efficiency and throughput of the pick-and-place task. The Results section compares our approach with a baseline deep learning approach and a ResNet-based approach, and reports very promising results at varying clutter densities across a range of complex test scenarios.

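To make the described architecture concrete, below is a minimal sketch (not the authors' released code) of the idea in the abstract: one DenseNet-121-based fully convolutional network per discrete action, each producing a pixel-wise affordance map, with the greedy action executed at the highest-scoring pixel. All class names, sizes, and the action-selection details are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class AffordanceFCN(nn.Module):
    """One FCN per action: DenseNet-121 backbone + 1x1 conv head
    producing a per-pixel affordance/Q score (hypothetical sketch)."""
    def __init__(self):
        super().__init__()
        # DenseNet-121 feature extractor (1024 output channels, stride 32)
        self.backbone = torchvision.models.densenet121(weights=None).features
        # 1x1 convolution head: one affordance score per spatial location
        self.head = nn.Conv2d(1024, 1, kernel_size=1)

    def forward(self, x):
        f = self.backbone(x)          # (B, 1024, H/32, W/32)
        q = self.head(f)              # (B, 1, H/32, W/32)
        # upsample so every input pixel gets an affordance score
        return F.interpolate(q, size=x.shape[-2:], mode="bilinear",
                             align_corners=False)

ACTIONS = ["grasp", "left_slide", "right_slide"]   # the three MDP actions
nets = {a: AffordanceFCN() for a in ACTIONS}       # one FCN per action

def select_action(obs):
    # Greedy policy: evaluate all three affordance maps and execute the
    # action at the single highest-scoring pixel.
    with torch.no_grad():
        maps = {a: nets[a](obs) for a in ACTIONS}
    best = max(ACTIONS, key=lambda a: maps[a].max().item())
    flat = maps[best][0, 0].flatten().argmax().item()
    cols = maps[best].shape[-1]
    return best, divmod(flat, cols)                # (action, (row, col))

Training would then presumably follow the standard DQN temporal-difference update, regressing the executed pixel's output toward the target y = r + γ · max Q(s′, ·) over actions and pixels, and backpropagating only through the FCN of the executed action, consistent with the per-action weight tuning the abstract describes.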

Bibliographic Details
Main Authors: Imtiaz, Muhammad Babar; Qiao, Yuansong; Lee, Brian
Format: Online Article Text
Language: English
Published: MDPI, 2023
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9919624/
https://www.ncbi.nlm.nih.gov/pubmed/36772553
http://dx.doi.org/10.3390/s23031513
Collection: PubMed
Record ID: pubmed-9919624
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: Sensors (Basel)
Published Online: 29 January 2023
License: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).