Implementation of Deep Deterministic Policy Gradients for Controlling Dynamic Bipedal Walking

A control system for bipedal walking in the sagittal plane was developed in simulation. The biped model was built based on anthropometric data for a 1.8 m tall male of average build. At the core of the controller is a deep deterministic policy gradient (DDPG) neural network that was trained in GAZEB...

Bibliographic Details
Main Authors: Liu, Chujun; Lonsberry, Andrew G.; Nandor, Mark J.; Audu, Musa L.; Lonsberry, Alexander J.; Quinn, Roger D.
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6477666/
https://www.ncbi.nlm.nih.gov/pubmed/31105213
http://dx.doi.org/10.3390/biomimetics4010028
_version_ 1783413058315485184
author Liu, Chujun
Lonsberry, Andrew G.
Nandor, Mark J.
Audu, Musa L.
Lonsberry, Alexander J.
Quinn, Roger D.
author_facet Liu, Chujun
Lonsberry, Andrew G.
Nandor, Mark J.
Audu, Musa L.
Lonsberry, Alexander J.
Quinn, Roger D.
author_sort Liu, Chujun
collection PubMed
description A control system for bipedal walking in the sagittal plane was developed in simulation. The biped model was built based on anthropometric data for a 1.8 m tall male of average build. At the core of the controller is a deep deterministic policy gradient (DDPG) neural network that was trained in GAZEBO, a physics simulator, to predict the ideal foot placement to maintain stable walking despite external disturbances. The complexity of the DDPG network was decreased through carefully selected state variables and a distributed control system. Additional controllers for the hip joints during their stance phases and the ankle joint during toe-off phase help to stabilize the biped during walking. The simulated biped can walk at a steady pace of approximately 1 m/s, and during locomotion it can maintain stability with a 30 kg·m/s impulse applied forward on the torso or a 40 kg·m/s impulse applied rearward. It also maintains stable walking with a 10 kg backpack or a 25 kg front pack. The controller was trained on a 1.8 m tall model, but also stabilizes models 1.4–2.3 m tall with no changes.
format Online
Article
Text
id pubmed-6477666
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6477666 2019-05-16 Implementation of Deep Deterministic Policy Gradients for Controlling Dynamic Bipedal Walking Liu, Chujun Lonsberry, Andrew G. Nandor, Mark J. Audu, Musa L. Lonsberry, Alexander J. Quinn, Roger D. Biomimetics (Basel) Article A control system for bipedal walking in the sagittal plane was developed in simulation. The biped model was built based on anthropometric data for a 1.8 m tall male of average build. At the core of the controller is a deep deterministic policy gradient (DDPG) neural network that was trained in GAZEBO, a physics simulator, to predict the ideal foot placement to maintain stable walking despite external disturbances. The complexity of the DDPG network was decreased through carefully selected state variables and a distributed control system. Additional controllers for the hip joints during their stance phases and the ankle joint during toe-off phase help to stabilize the biped during walking. The simulated biped can walk at a steady pace of approximately 1 m/s, and during locomotion it can maintain stability with a 30 kg·m/s impulse applied forward on the torso or a 40 kg·m/s impulse applied rearward. It also maintains stable walking with a 10 kg backpack or a 25 kg front pack. The controller was trained on a 1.8 m tall model, but also stabilizes models 1.4–2.3 m tall with no changes. MDPI 2019-03-22 /pmc/articles/PMC6477666/ /pubmed/31105213 http://dx.doi.org/10.3390/biomimetics4010028 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Liu, Chujun
Lonsberry, Andrew G.
Nandor, Mark J.
Audu, Musa L.
Lonsberry, Alexander J.
Quinn, Roger D.
Implementation of Deep Deterministic Policy Gradients for Controlling Dynamic Bipedal Walking
title Implementation of Deep Deterministic Policy Gradients for Controlling Dynamic Bipedal Walking
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6477666/
https://www.ncbi.nlm.nih.gov/pubmed/31105213
http://dx.doi.org/10.3390/biomimetics4010028