Humanoids Learning to Walk: A Natural CPG-Actor-Critic Architecture

The identification of learning mechanisms for locomotion has been the subject of much research for some time but many challenges remain. Dynamic systems theory (DST) offers a novel approach to humanoid learning through environmental interaction. Reinforcement learning (RL) has offered a promising me...

Bibliographic Details
Main authors:	Li, Cai; Lowe, Robert; Ziemke, Tom
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2013
Subjects:	Neuroscience
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3619089/ https://www.ncbi.nlm.nih.gov/pubmed/23675345 http://dx.doi.org/10.3389/fnbot.2013.00005

author	Li, Cai; Lowe, Robert; Ziemke, Tom
author_facet	Li, Cai; Lowe, Robert; Ziemke, Tom
author_sort	Li, Cai
collection	PubMed
description	The identification of learning mechanisms for locomotion has long been a subject of research, but many challenges remain. Dynamic systems theory (DST) offers a novel approach to humanoid learning through environmental interaction. Reinforcement learning (RL) offers a promising method for adaptively linking the dynamic system to the environment it interacts with via a reward-based value system. In this paper, we propose a model that integrates these perspectives and apply it to the case of a humanoid (NAO) robot learning to walk, an ability that emerges from its value-based interaction with the environment. In the model, a simplified central pattern generator (CPG) architecture inspired by neuroscientific research and DST is integrated with an actor-critic approach to RL (cpg-actor-critic). In the cpg-actor-critic architecture, least-squares temporal-difference-based learning converges to the optimal solution quickly by using natural gradient learning and by balancing exploration and exploitation. Furthermore, rather than using a traditional (designer-specified) reward, it uses a dynamic value function as a stability indicator that adapts to the environment. The results obtained are analyzed using a novel DST-based embodied cognition approach. Learning to walk, from this perspective, is a process of integrating levels of sensorimotor activity and value.
format	Online Article Text
id	pubmed-3619089
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-3619089 2013-05-14 Humanoids Learning to Walk: A Natural CPG-Actor-Critic Architecture Li, Cai; Lowe, Robert; Ziemke, Tom Front Neurorobot Neuroscience Frontiers Media S.A. 2013-04-08 /pmc/articles/PMC3619089/ /pubmed/23675345 http://dx.doi.org/10.3389/fnbot.2013.00005 Text en Copyright © 2013 Li, Lowe and Ziemke.
http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.
spellingShingle	Neuroscience Li, Cai; Lowe, Robert; Ziemke, Tom Humanoids Learning to Walk: A Natural CPG-Actor-Critic Architecture
title	Humanoids Learning to Walk: A Natural CPG-Actor-Critic Architecture
title_sort	humanoids learning to walk: a natural cpg-actor-critic architecture
topic	Neuroscience
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3619089/ https://www.ncbi.nlm.nih.gov/pubmed/23675345 http://dx.doi.org/10.3389/fnbot.2013.00005
