Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
Locomotion control has long been vital to legged robots. Agile locomotion can be implemented through either a model-based controller or reinforcement learning. It has been shown that model-based methods can yield robust controllers, while learning-based policies have advantages in generalization...
Main authors: | Wang, Zhicheng; Wei, Wandi; Xie, Anhuan; Zhang, Yifeng; Wu, Jun; Zhu, Qiuguo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9611364/ https://www.ncbi.nlm.nih.gov/pubmed/36296041 http://dx.doi.org/10.3390/mi13101688 |
_version_ | 1784819507549700096 |
---|---|
author | Wang, Zhicheng Wei, Wandi Xie, Anhuan Zhang, Yifeng Wu, Jun Zhu, Qiuguo |
author_sort | Wang, Zhicheng |
collection | PubMed |
description | Locomotion control has long been vital to legged robots. Agile locomotion can be implemented through either a model-based controller or reinforcement learning. It has been shown that model-based methods can yield robust controllers, while learning-based policies have advantages in generalization. This paper proposes a hybrid locomotion-control framework that combines deep reinforcement learning with a simple heuristic policy and assigns them to different activation phases, which provides guidance for adaptive training without producing conflicts between heuristic knowledge and learned policies. Training in simulation follows a step-by-step stochastic curriculum to guarantee success. Domain randomization during training and assistive extra feedback loops on the real robot are also adopted to smooth the transition to the real world. Comparison experiments are carried out on both simulated and real Wukong-IV humanoid robots, and the proposed hybrid approach matches the canonical end-to-end approaches while achieving a higher success rate, faster convergence, and 60% less tracking error in velocity-tracking tasks. |
format | Online Article Text |
id | pubmed-9611364 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9611364 2022-10-28 Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics Wang, Zhicheng Wei, Wandi Xie, Anhuan Zhang, Yifeng Wu, Jun Zhu, Qiuguo Micromachines (Basel) Article MDPI 2022-10-07 /pmc/articles/PMC9611364/ /pubmed/36296041 http://dx.doi.org/10.3390/mi13101688 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics |
topic | Article |