Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics

Locomotion control has long been vital to legged robots. Agile locomotion can be implemented through either a model-based controller or reinforcement learning. It is proven that robust controllers can be obtained through model-based methods, and learning-based policies have advantages in generalization...

Bibliographic Details
Main authors: Wang, Zhicheng, Wei, Wandi, Xie, Anhuan, Zhang, Yifeng, Wu, Jun, Zhu, Qiuguo
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9611364/
https://www.ncbi.nlm.nih.gov/pubmed/36296041
http://dx.doi.org/10.3390/mi13101688
_version_ 1784819507549700096
author Wang, Zhicheng
Wei, Wandi
Xie, Anhuan
Zhang, Yifeng
Wu, Jun
Zhu, Qiuguo
author_facet Wang, Zhicheng
Wei, Wandi
Xie, Anhuan
Zhang, Yifeng
Wu, Jun
Zhu, Qiuguo
author_sort Wang, Zhicheng
collection PubMed
description Locomotion control has long been vital to legged robots. Agile locomotion can be implemented through either a model-based controller or reinforcement learning. It is proven that robust controllers can be obtained through model-based methods, and learning-based policies have advantages in generalization. This paper proposes a hybrid locomotion-control framework that combines deep reinforcement learning with a simple heuristic policy and assigns them to different activation phases, which provides guidance for adaptive training without producing conflicts between heuristic knowledge and learned policies. The training in simulation follows a step-by-step stochastic curriculum to guarantee success. Domain randomization during training and assistive extra feedback loops on the real robot are also adopted to smooth the transition to the real world. Comparison experiments are carried out on both simulated and real Wukong-IV humanoid robots, and the proposed hybrid approach matches the canonical end-to-end approaches with a higher rate of success, faster convergence, and 60% less tracking error in velocity tracking tasks.
format Online
Article
Text
id pubmed-9611364
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-96113642022-10-28 Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics Wang, Zhicheng Wei, Wandi Xie, Anhuan Zhang, Yifeng Wu, Jun Zhu, Qiuguo Micromachines (Basel) Article Locomotion control has long been vital to legged robots. Agile locomotion can be implemented through either model-based controller or reinforcement learning. It is proven that robust controllers can be obtained through model-based methods and learning-based policies have advantages in generalization. This paper proposed a hybrid framework of locomotion controller that combines deep reinforcement learning and simple heuristic policy and assigns them to different activation phases, which provides guidance for adaptive training without producing conflicts between heuristic knowledge and learned policies. The training in simulation follows a step-by-step stochastic curriculum to guarantee success. Domain randomization during training and assistive extra feedback loops on real robot are also adopted to smooth the transition to the real world. Comparison experiments are carried out on both simulated and real Wukong-IV humanoid robots, and the proposed hybrid approach matches the canonical end-to-end approaches with higher rate of success, faster converging speed, and 60% less tracking error in velocity tracking tasks. MDPI 2022-10-07 /pmc/articles/PMC9611364/ /pubmed/36296041 http://dx.doi.org/10.3390/mi13101688 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Wang, Zhicheng
Wei, Wandi
Xie, Anhuan
Zhang, Yifeng
Wu, Jun
Zhu, Qiuguo
Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_full Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_fullStr Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_full_unstemmed Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_short Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
title_sort hybrid bipedal locomotion based on reinforcement learning and heuristics
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9611364/
https://www.ncbi.nlm.nih.gov/pubmed/36296041
http://dx.doi.org/10.3390/mi13101688
work_keys_str_mv AT wangzhicheng hybridbipedallocomotionbasedonreinforcementlearningandheuristics
AT weiwandi hybridbipedallocomotionbasedonreinforcementlearningandheuristics
AT xieanhuan hybridbipedallocomotionbasedonreinforcementlearningandheuristics
AT zhangyifeng hybridbipedallocomotionbasedonreinforcementlearningandheuristics
AT wujun hybridbipedallocomotionbasedonreinforcementlearningandheuristics
AT zhuqiuguo hybridbipedallocomotionbasedonreinforcementlearningandheuristics