Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization

We present a general, two-stage reinforcement learning approach to create robust policies that can be deployed on real robots without any additional training using a single demonstration generated by trajectory optimization. The demonstration is used in the first stage as a starting point to facilit...

Bibliographic Details
Main authors: Bogdanovic, Miroslav, Khadiv, Majid, Righetti, Ludovic
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484268/
https://www.ncbi.nlm.nih.gov/pubmed/36134338
http://dx.doi.org/10.3389/frobt.2022.854212
_version_ 1784791846323486720
author Bogdanovic, Miroslav
Khadiv, Majid
Righetti, Ludovic
author_facet Bogdanovic, Miroslav
Khadiv, Majid
Righetti, Ludovic
author_sort Bogdanovic, Miroslav
collection PubMed
description We present a general, two-stage reinforcement learning approach to create robust policies that can be deployed on real robots without any additional training using a single demonstration generated by trajectory optimization. The demonstration is used in the first stage as a starting point to facilitate initial exploration. In the second stage, the relevant task reward is optimized directly and a policy robust to environment uncertainties is computed. We demonstrate and examine in detail the performance and robustness of our approach on highly dynamic hopping and bounding tasks on a quadruped robot.
format Online
Article
Text
id pubmed-9484268
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-94842682022-09-20 Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization Bogdanovic, Miroslav Khadiv, Majid Righetti, Ludovic Front Robot AI Robotics and AI We present a general, two-stage reinforcement learning approach to create robust policies that can be deployed on real robots without any additional training using a single demonstration generated by trajectory optimization. The demonstration is used in the first stage as a starting point to facilitate initial exploration. In the second stage, the relevant task reward is optimized directly and a policy robust to environment uncertainties is computed. We demonstrate and examine in detail the performance and robustness of our approach on highly dynamic hopping and bounding tasks on a quadruped robot. Frontiers Media S.A. 2022-08-31 /pmc/articles/PMC9484268/ /pubmed/36134338 http://dx.doi.org/10.3389/frobt.2022.854212 Text en Copyright © 2022 Bogdanovic, Khadiv and Righetti. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Robotics and AI
Bogdanovic, Miroslav
Khadiv, Majid
Righetti, Ludovic
Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_full Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_fullStr Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_full_unstemmed Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_short Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
title_sort model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
topic Robotics and AI
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484268/
https://www.ncbi.nlm.nih.gov/pubmed/36134338
http://dx.doi.org/10.3389/frobt.2022.854212
work_keys_str_mv AT bogdanovicmiroslav modelfreereinforcementlearningforrobustlocomotionusingdemonstrationsfromtrajectoryoptimization
AT khadivmajid modelfreereinforcementlearningforrobustlocomotionusingdemonstrationsfromtrajectoryoptimization
AT righettiludovic modelfreereinforcementlearningforrobustlocomotionusingdemonstrationsfromtrajectoryoptimization