Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics

Locomotion control has long been vital to legged robots. Agile locomotion can be implemented through either a model-based controller or reinforcement learning. It has been shown that robust controllers can be obtained through model-based methods, and that learning-based policies have advantages in generalization...

Bibliographic Details
Main Authors:	Wang, Zhicheng; Wei, Wandi; Xie, Anhuan; Zhang, Yifeng; Wu, Jun; Zhu, Qiuguo
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9611364/ https://www.ncbi.nlm.nih.gov/pubmed/36296041 http://dx.doi.org/10.3390/mi13101688

_version_	1784819507549700096
author	Wang, Zhicheng Wei, Wandi Xie, Anhuan Zhang, Yifeng Wu, Jun Zhu, Qiuguo
author_facet	Wang, Zhicheng Wei, Wandi Xie, Anhuan Zhang, Yifeng Wu, Jun Zhu, Qiuguo
author_sort	Wang, Zhicheng
collection	PubMed
description	Locomotion control has long been vital to legged robots. Agile locomotion can be implemented through either a model-based controller or reinforcement learning. It has been shown that robust controllers can be obtained through model-based methods, and that learning-based policies have advantages in generalization. This paper proposes a hybrid locomotion-controller framework that combines deep reinforcement learning with a simple heuristic policy and assigns them to different activation phases, which provides guidance for adaptive training without producing conflicts between heuristic knowledge and learned policies. Training in simulation follows a step-by-step stochastic curriculum to guarantee success. Domain randomization during training and assistive extra feedback loops on the real robot are also adopted to smooth the transition to the real world. Comparison experiments are carried out on both simulated and real Wukong-IV humanoid robots, and the proposed hybrid approach matches the canonical end-to-end approaches with a higher success rate, faster convergence, and 60% less tracking error in velocity-tracking tasks.
format	Online Article Text
id	pubmed-9611364
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9611364 2022-10-28 Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics Wang, Zhicheng Wei, Wandi Xie, Anhuan Zhang, Yifeng Wu, Jun Zhu, Qiuguo Micromachines (Basel) Article Locomotion control has long been vital to legged robots. Agile locomotion can be implemented through either model-based controller or reinforcement learning. It is proven that robust controllers can be obtained through model-based methods and learning-based policies have advantages in generalization. This paper proposed a hybrid framework of locomotion controller that combines deep reinforcement learning and simple heuristic policy and assigns them to different activation phases, which provides guidance for adaptive training without producing conflicts between heuristic knowledge and learned policies. The training in simulation follows a step-by-step stochastic curriculum to guarantee success. Domain randomization during training and assistive extra feedback loops on real robot are also adopted to smooth the transition to the real world. Comparison experiments are carried out on both simulated and real Wukong-IV humanoid robots, and the proposed hybrid approach matches the canonical end-to-end approaches with higher rate of success, faster converging speed, and 60% less tracking error in velocity tracking tasks. MDPI 2022-10-07 /pmc/articles/PMC9611364/ /pubmed/36296041 http://dx.doi.org/10.3390/mi13101688 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Wang, Zhicheng Wei, Wandi Xie, Anhuan Zhang, Yifeng Wu, Jun Zhu, Qiuguo Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title	Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_full	Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_fullStr	Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_full_unstemmed	Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_short	Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_sort	hybrid bipedal locomotion based on reinforcement learning and heuristics
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9611364/ https://www.ncbi.nlm.nih.gov/pubmed/36296041 http://dx.doi.org/10.3390/mi13101688
work_keys_str_mv	AT wangzhicheng hybridbipedallocomotionbasedonreinforcementlearningandheuristics AT weiwandi hybridbipedallocomotionbasedonreinforcementlearningandheuristics AT xieanhuan hybridbipedallocomotionbasedonreinforcementlearningandheuristics AT zhangyifeng hybridbipedallocomotionbasedonreinforcementlearningandheuristics AT wujun hybridbipedallocomotionbasedonreinforcementlearningandheuristics AT zhuqiuguo hybridbipedallocomotionbasedonreinforcementlearningandheuristics
