
Behavior policy learning: Learning multi-stage tasks via solution sketches and model-based controllers

Multi-stage tasks are a challenge for reinforcement learning methods, and require either specific task knowledge (e.g., task segmentation) or big amount of interaction times to be learned. In this paper, we propose Behavior Policy Learning (BPL) that effectively combines 1) only few solution sketche...


Bibliographic Details
Main authors:	Tsinganos, Konstantinos, Chatzilygeroudis, Konstantinos, Hadjivelichkov, Denis, Komninos, Theodoros, Dermatas, Evangelos, Kanoulas, Dimitrios
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Robotics and AI
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9597635/ https://www.ncbi.nlm.nih.gov/pubmed/36313244 http://dx.doi.org/10.3389/frobt.2022.974537

_version_	1784816138763370496
author	Tsinganos, Konstantinos Chatzilygeroudis, Konstantinos Hadjivelichkov, Denis Komninos, Theodoros Dermatas, Evangelos Kanoulas, Dimitrios
author_sort	Tsinganos, Konstantinos
collection	PubMed
description	Multi-stage tasks are a challenge for reinforcement learning methods, and require either specific task knowledge (e.g., task segmentation) or big amount of interaction times to be learned. In this paper, we propose Behavior Policy Learning (BPL) that effectively combines 1) only few solution sketches, that is demonstrations without the actions, but only the states, 2) model-based controllers, and 3) simulations to effectively solve multi-stage tasks without strong knowledge about the underlying task. Our main intuition is that solution sketches alone can provide strong data for learning a high-level trajectory by imitation, and model-based controllers can be used to follow this trajectory (we call it behavior) effectively. Finally, we utilize robotic simulations to further improve the policy and make it robust in a Sim2Real style. We evaluate our method in simulation with a robotic manipulator that has to perform two tasks with variations: 1) grasp a box and place it in a basket, and 2) re-place a book on a different level within a bookcase. We also validate the Sim2Real capabilities of our method by performing real-world experiments and realistic simulated experiments where the objects are tracked through an RGB-D camera for the first task.
format	Online Article Text
id	pubmed-9597635
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9597635 2022-10-27 Behavior policy learning: Learning multi-stage tasks via solution sketches and model-based controllers Tsinganos, Konstantinos Chatzilygeroudis, Konstantinos Hadjivelichkov, Denis Komninos, Theodoros Dermatas, Evangelos Kanoulas, Dimitrios Front Robot AI Robotics and AI Multi-stage tasks are a challenge for reinforcement learning methods, and require either specific task knowledge (e.g., task segmentation) or big amount of interaction times to be learned. In this paper, we propose Behavior Policy Learning (BPL) that effectively combines 1) only few solution sketches, that is demonstrations without the actions, but only the states, 2) model-based controllers, and 3) simulations to effectively solve multi-stage tasks without strong knowledge about the underlying task. Our main intuition is that solution sketches alone can provide strong data for learning a high-level trajectory by imitation, and model-based controllers can be used to follow this trajectory (we call it behavior) effectively. Finally, we utilize robotic simulations to further improve the policy and make it robust in a Sim2Real style. We evaluate our method in simulation with a robotic manipulator that has to perform two tasks with variations: 1) grasp a box and place it in a basket, and 2) re-place a book on a different level within a bookcase. We also validate the Sim2Real capabilities of our method by performing real-world experiments and realistic simulated experiments where the objects are tracked through an RGB-D camera for the first task. Frontiers Media S.A. 2022-10-12 /pmc/articles/PMC9597635/ /pubmed/36313244 http://dx.doi.org/10.3389/frobt.2022.974537 Text en Copyright © 2022 Tsinganos, Chatzilygeroudis, Hadjivelichkov, Komninos, Dermatas and Kanoulas. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Robotics and AI Tsinganos, Konstantinos Chatzilygeroudis, Konstantinos Hadjivelichkov, Denis Komninos, Theodoros Dermatas, Evangelos Kanoulas, Dimitrios Behavior policy learning: Learning multi-stage tasks via solution sketches and model-based controllers
title	Behavior policy learning: Learning multi-stage tasks via solution sketches and model-based controllers
topic	Robotics and AI
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9597635/ https://www.ncbi.nlm.nih.gov/pubmed/36313244 http://dx.doi.org/10.3389/frobt.2022.974537
