Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task
Main Authors: | Kinjo, Ken; Uchibe, Eiji; Doya, Kenji |
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2013 |
Subjects: | Neuroscience |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3617398/ https://www.ncbi.nlm.nih.gov/pubmed/23576983 http://dx.doi.org/10.3389/fnbot.2013.00007 |
author | Kinjo, Ken; Uchibe, Eiji; Doya, Kenji
collection | PubMed |
description | Linearly solvable Markov Decision Process (LMDP) is a class of optimal control problems in which the Bellman equation can be converted into a linear equation by an exponential transformation of the state value function (Todorov, 2009b). In an LMDP, the optimal value function and the corresponding control policy are obtained by solving an eigenvalue problem in a discrete state space, or an eigenfunction problem in a continuous state space, using knowledge of the system dynamics and the action, state, and terminal cost functions. In this study, we evaluate the effectiveness of the LMDP framework in real robot control, in which the dynamics of the body and the environment have to be learned from experience. We first perform a simulation study of a pole swing-up task to evaluate the effect of the accuracy of the learned dynamics model on the derived action policy. The result shows that a crude linear approximation of the non-linear dynamics can still allow solution of the task, albeit with a higher total cost. We then perform real robot experiments on a battery-catching task using our Spring Dog mobile robot platform. The state is given by the position and size of a battery in the robot's camera view and two neck joint angles. The action is the velocities of the two wheels, while the neck joints are controlled by a visual servo controller. We test linear and bilinear dynamics models in tasks with quadratic and Gaussian state cost functions. In the quadratic cost task, the LMDP controller derived from a learned linear dynamics model performed equivalently to the optimal linear quadratic regulator (LQR). In the non-quadratic task, the LMDP controller with a linear dynamics model showed the best performance. The results demonstrate the usefulness of the LMDP framework in real robot control, even when simple linear models are used for dynamics learning.
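As background for the abstract above, a standard result from the LMDP literature (Todorov, 2009), not spelled out in this record: for a discrete-state LMDP with passive dynamics $p(x' \mid x)$, state cost $q(x)$, and a control cost equal to the KL divergence between the controlled and passive transition distributions, the transformation $z(x) = \exp(-v(x))$ of the value function $v$ turns the Bellman equation into a linear one,

$$z(x) = e^{-q(x)} \sum_{x'} p(x' \mid x)\, z(x'),$$

or in matrix form $z = GPz$ with $G = \operatorname{diag}(e^{-q})$. In the infinite-horizon average-cost setting this becomes the eigenvalue problem $\lambda z = GPz$, where $-\log\lambda$ is the optimal average cost, and the optimal policy simply reweights the passive dynamics by downstream desirability: $u^*(x' \mid x) \propto p(x' \mid x)\, z(x')$.

A minimal numerical sketch of solving this eigenvalue problem by power iteration (illustrative only; the function and array names are hypothetical, and the paper's actual implementation is not described in this record):

```python
import numpy as np

def solve_lmdp(P, q, n_iter=10_000, tol=1e-12):
    """Sketch: solve a discrete average-cost LMDP by power iteration.

    P : (n, n) passive transition matrix, P[i, j] = p(x_j | x_i)
    q : (n,) state cost vector
    Returns the desirability z = exp(-v) (up to scale), an estimate of
    the optimal average cost, and the controlled transition matrix.
    """
    M = np.diag(np.exp(-q)) @ P        # linear Bellman operator G P
    z = np.ones(P.shape[0])            # initial desirability guess
    for _ in range(n_iter):
        z_new = M @ z
        lam = np.linalg.norm(z_new)    # converges to the leading eigenvalue
        z_new /= lam
        if np.linalg.norm(z_new - z) < tol:
            z = z_new
            break
        z = z_new
    # Optimal policy: passive dynamics reweighted by desirability.
    U = P * z[None, :]
    U /= U.sum(axis=1, keepdims=True)
    return z, -np.log(lam), U
```

For instance, on a small gridworld with random-walk passive dynamics and a low-cost goal state, `z` should peak near the goal and `U` should bias transitions toward it; the exponential transform is what reduces optimal control to a single linear-algebra computation rather than an iterative dynamic-programming sweep.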
format | Online Article Text |
id | pubmed-3617398 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
published | 2013-04-05
rights | Copyright © 2013 Kinjo, Uchibe and Doya. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics.
title | Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3617398/ https://www.ncbi.nlm.nih.gov/pubmed/23576983 http://dx.doi.org/10.3389/fnbot.2013.00007 |