
Sim-to-real via latent prediction: Transferring visual non-prehensile manipulation policies

Reinforcement Learning has been shown to have a great potential for robotics. It demonstrated the capability to solve complex manipulation and locomotion tasks, even by learning end-to-end policies that operate directly on visual input, removing the need for custom perception systems. However, for p...


Bibliographic Details
Main authors:	Rizzardo, Carlo; Chen, Fei; Caldwell, Darwin
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2023
Subjects:	Robotics and AI
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9879568/ https://www.ncbi.nlm.nih.gov/pubmed/36714802 http://dx.doi.org/10.3389/frobt.2022.1067502

author	Rizzardo, Carlo Chen, Fei Caldwell, Darwin
collection	PubMed
description	Reinforcement Learning has shown great potential for robotics. It has demonstrated the capability to solve complex manipulation and locomotion tasks, even by learning end-to-end policies that operate directly on visual input, removing the need for custom perception systems. However, for practical robotics applications, its poor sample efficiency and its need for huge amounts of resources, data, and computation time can be an insurmountable obstacle. One potential solution to this sample efficiency issue is the use of simulated environments. However, the discrepancy in visual and physical characteristics between reality and simulation, namely the sim-to-real gap, often significantly reduces the real-world performance of policies trained within a simulator. In this work we propose a sim-to-real technique that trains a Soft Actor-Critic agent together with a decoupled feature extractor and a latent-space dynamics model. The decoupled nature of the method allows the sim-to-real transfer of the feature extractor and of the control policy to be performed independently, and the dynamics model acts as a constraint on the latent representation when finetuning the feature extractor on real-world data. We show how this architecture allows a trained agent to be transferred from simulation to reality without retraining or finetuning the control policy, using real-world data only to adapt the feature extractor. By avoiding training the control policy in the real domain, we overcome the need to apply Reinforcement Learning on real-world data; instead, we focus only on the unsupervised training of the feature extractor, considerably reducing real-world experience collection requirements. We evaluate the method on sim-to-sim and sim-to-real transfer of a policy for table-top robotic object pushing. We demonstrate that the method is capable of adapting to considerable variations in the task observations, such as changes in point of view, colors, and lighting, all while substantially reducing the training time with respect to policies trained directly in the real world.
format	Online Article Text
id	pubmed-9879568
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9879568 2023-01-27 Sim-to-real via latent prediction: Transferring visual non-prehensile manipulation policies Rizzardo, Carlo Chen, Fei Caldwell, Darwin Front Robot AI Robotics and AI Frontiers Media S.A. 2023-01-12 /pmc/articles/PMC9879568/ /pubmed/36714802 http://dx.doi.org/10.3389/frobt.2022.1067502 Text en Copyright © 2023 Rizzardo, Chen and Caldwell. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Sim-to-real via latent prediction: Transferring visual non-prehensile manipulation policies
topic	Robotics and AI
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9879568/ https://www.ncbi.nlm.nih.gov/pubmed/36714802 http://dx.doi.org/10.3389/frobt.2022.1067502


Similar Items