Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization

We present a general, two-stage reinforcement learning approach to create robust policies that can be deployed on real robots without any additional training using a single demonstration generated by trajectory optimization. The demonstration is used in the first stage as a starting point to facilit...

Bibliographic Details
Main Authors:	Bogdanovic, Miroslav; Khadiv, Majid; Righetti, Ludovic
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Robotics and AI
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484268/ https://www.ncbi.nlm.nih.gov/pubmed/36134338 http://dx.doi.org/10.3389/frobt.2022.854212

_version_	1784791846323486720
author	Bogdanovic, Miroslav Khadiv, Majid Righetti, Ludovic
author_facet	Bogdanovic, Miroslav Khadiv, Majid Righetti, Ludovic
author_sort	Bogdanovic, Miroslav
collection	PubMed
description	We present a general, two-stage reinforcement learning approach to create robust policies that can be deployed on real robots without any additional training using a single demonstration generated by trajectory optimization. The demonstration is used in the first stage as a starting point to facilitate initial exploration. In the second stage, the relevant task reward is optimized directly and a policy robust to environment uncertainties is computed. We demonstrate and examine in detail the performance and robustness of our approach on highly dynamic hopping and bounding tasks on a quadruped robot.
format	Online Article Text
id	pubmed-9484268
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9484268 2022-09-20 Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization Bogdanovic, Miroslav Khadiv, Majid Righetti, Ludovic Front Robot AI Robotics and AI We present a general, two-stage reinforcement learning approach to create robust policies that can be deployed on real robots without any additional training using a single demonstration generated by trajectory optimization. The demonstration is used in the first stage as a starting point to facilitate initial exploration. In the second stage, the relevant task reward is optimized directly and a policy robust to environment uncertainties is computed. We demonstrate and examine in detail the performance and robustness of our approach on highly dynamic hopping and bounding tasks on a quadruped robot. Frontiers Media S.A. 2022-08-31 /pmc/articles/PMC9484268/ /pubmed/36134338 http://dx.doi.org/10.3389/frobt.2022.854212 Text en Copyright © 2022 Bogdanovic, Khadiv and Righetti. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Robotics and AI Bogdanovic, Miroslav Khadiv, Majid Righetti, Ludovic Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title	Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_full	Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_fullStr	Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_full_unstemmed	Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_short	Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_sort	model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
topic	Robotics and AI
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484268/ https://www.ncbi.nlm.nih.gov/pubmed/36134338 http://dx.doi.org/10.3389/frobt.2022.854212
work_keys_str_mv	AT bogdanovicmiroslav modelfreereinforcementlearningforrobustlocomotionusingdemonstrationsfromtrajectoryoptimization AT khadivmajid modelfreereinforcementlearningforrobustlocomotionusingdemonstrationsfromtrajectoryoptimization AT righettiludovic modelfreereinforcementlearningforrobustlocomotionusingdemonstrationsfromtrajectoryoptimization
