Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee

In this study, we investigated a control algorithm for a semi-active prosthetic knee based on reinforcement learning (RL). Model-free reinforcement Q-learning control with a reward shaping function was proposed as the voltage controller of a magnetorheological damper based on the prosthetic knee. Th...

Bibliographic Details
Main Authors: Hutabarat, Yonatan, Ekkachai, Kittipong, Hayashibe, Mitsuhiro, Kongprawechnon, Waree
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Neuroscience
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7726251/
https://www.ncbi.nlm.nih.gov/pubmed/33324190
http://dx.doi.org/10.3389/fnbot.2020.565702
_version_ 1783620842835410944
author Hutabarat, Yonatan
Ekkachai, Kittipong
Hayashibe, Mitsuhiro
Kongprawechnon, Waree
author_facet Hutabarat, Yonatan
Ekkachai, Kittipong
Hayashibe, Mitsuhiro
Kongprawechnon, Waree
author_sort Hutabarat, Yonatan
collection PubMed
description In this study, we investigated a control algorithm for a semi-active prosthetic knee based on reinforcement learning (RL). Model-free reinforcement Q-learning control with a reward shaping function was proposed as the voltage controller of a magnetorheological damper based on the prosthetic knee. The reward function was designed as a function of the performance index that accounts for the trajectory of the subject-specific knee angle. We compared our proposed reward function to a conventional single reward function under the same random initialization of a Q-matrix. We trained this control algorithm to adapt to several walking speed datasets under one control policy and subsequently compared its performance with that of other control algorithms. The results showed that our proposed reward function performed better than the conventional single reward function in terms of the normalized root mean squared error and also showed a faster convergence trend. Furthermore, our control strategy converged within our desired performance index and could adapt to several walking speeds. Our proposed control structure also has better overall performance than user-adaptive control, and at some walking speeds it performed better than the neural network predictive control from existing studies.
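The description above names three ingredients: a normalized root mean squared error (NRMSE) performance index, a reward that depends on that index, and a tabular Q-learning update. The following Python sketch illustrates how these pieces could fit together. It is not the authors' implementation: the range-based normalization, the `target` and `scale` values, the exponential shape of `shaped_reward`, the step-like `single_reward`, and the state/action sizes are all assumptions chosen for illustration.

```python
import numpy as np


def nrmse(reference, measured):
    """RMSE between two trajectories, normalized by the reference range.

    Assumption: range normalization; the record does not state which
    normalization the study used.
    """
    reference = np.asarray(reference, dtype=float)
    measured = np.asarray(measured, dtype=float)
    rmse = np.sqrt(np.mean((reference - measured) ** 2))
    return rmse / (reference.max() - reference.min())


def single_reward(error, target=0.01):
    """Hypothetical 'single' reward: 1 only when the index meets the target."""
    return 1.0 if error <= target else 0.0


def shaped_reward(error, target=0.01, scale=10.0):
    """Hypothetical shaped reward: full reward at or below the target,
    decaying smoothly as the performance index grows beyond it."""
    if error <= target:
        return 1.0
    return float(np.exp(-scale * (error - target)))


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place to q."""
    td_target = reward + gamma * np.max(q[next_state])
    q[state, action] += alpha * (td_target - q[state, action])
    return q


# Usage: one update from a randomly initialized Q-matrix.
rng = np.random.default_rng(0)
q = rng.random((5, 3))           # 5 hypothetical states, 3 hypothetical voltage levels
reference_angle = [0.0, 20.0, 60.0, 30.0, 0.0]   # illustrative knee angles (degrees)
measured_angle = [0.0, 18.0, 55.0, 33.0, 2.0]
error = nrmse(reference_angle, measured_angle)
q = q_update(q, state=0, action=1, reward=shaped_reward(error), next_state=2)
```

The shaped variant returns a graded signal even when the target is missed, which is one plausible reason a shaped reward could converge faster than a single all-or-nothing reward; the study's actual reward definition should be taken from the article itself.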
format Online
Article
Text
id pubmed-7726251
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-77262512020-12-14 Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee Hutabarat, Yonatan Ekkachai, Kittipong Hayashibe, Mitsuhiro Kongprawechnon, Waree Front Neurorobot Neuroscience In this study, we investigated a control algorithm for a semi-active prosthetic knee based on reinforcement learning (RL). Model-free reinforcement Q-learning control with a reward shaping function was proposed as the voltage controller of a magnetorheological damper based on the prosthetic knee. The reward function was designed as a function of the performance index that accounts for the trajectory of the subject-specific knee angle. We compared our proposed reward function to a conventional single reward function under the same random initialization of a Q-matrix. We trained this control algorithm to adapt to several walking speed datasets under one control policy and subsequently compared its performance with that of other control algorithms. The results showed that our proposed reward function performed better than the conventional single reward function in terms of the normalized root mean squared error and also showed a faster convergence trend. Furthermore, our control strategy converged within our desired performance index and could adapt to several walking speeds. Our proposed control structure has also an overall better performance compared to user-adaptive control, while some of its walking speeds performed better than the neural network predictive control from existing studies. Frontiers Media S.A. 2020-11-26 /pmc/articles/PMC7726251/ /pubmed/33324190 http://dx.doi.org/10.3389/fnbot.2020.565702 Text en Copyright © 2020 Hutabarat, Ekkachai, Hayashibe and Kongprawechnon. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). 
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Neuroscience
Hutabarat, Yonatan
Ekkachai, Kittipong
Hayashibe, Mitsuhiro
Kongprawechnon, Waree
Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee
title Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee
title_full Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee
title_fullStr Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee
title_full_unstemmed Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee
title_short Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee
title_sort reinforcement q-learning control with reward shaping function for swing phase control in a semi-active prosthetic knee
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7726251/
https://www.ncbi.nlm.nih.gov/pubmed/33324190
http://dx.doi.org/10.3389/fnbot.2020.565702
work_keys_str_mv AT hutabaratyonatan reinforcementqlearningcontrolwithrewardshapingfunctionforswingphasecontrolinasemiactiveprostheticknee
AT ekkachaikittipong reinforcementqlearningcontrolwithrewardshapingfunctionforswingphasecontrolinasemiactiveprostheticknee
AT hayashibemitsuhiro reinforcementqlearningcontrolwithrewardshapingfunctionforswingphasecontrolinasemiactiveprostheticknee
AT kongprawechnonwaree reinforcementqlearningcontrolwithrewardshapingfunctionforswingphasecontrolinasemiactiveprostheticknee