
Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration

This paper proposes a new framework, Compressive Kernelized Reinforcement Learning (CKRL), for computing near-optimal policies in sequential decision making under uncertainty. It combines non-adaptive, data-independent Random Projections with nonparametric Kernelized Least-squares...


Bibliographic Details
Main Authors:	Liu, Bo; Chen, Sanfeng; Li, Shuai; Liang, Yongsheng
Format:	Online Article Text
Language:	English
Published:	Molecular Diversity Preservation International (MDPI) 2012
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3376585/ https://www.ncbi.nlm.nih.gov/pubmed/22736969 http://dx.doi.org/10.3390/s120302632

_version_	1782235847860420608
author	Liu, Bo; Chen, Sanfeng; Li, Shuai; Liang, Yongsheng
author_facet	Liu, Bo; Chen, Sanfeng; Li, Shuai; Liang, Yongsheng
author_sort	Liu, Bo
collection	PubMed
description	This paper proposes a new framework, Compressive Kernelized Reinforcement Learning (CKRL), for computing near-optimal policies in sequential decision making under uncertainty. It combines non-adaptive, data-independent Random Projections with nonparametric Kernelized Least-Squares Policy Iteration (KLSPI). Random Projections are a fast, non-adaptive dimensionality reduction framework in which high-dimensional data is projected onto a random lower-dimensional subspace via spherically random rotation and coordinate sampling. KLSPI introduces the kernel trick into the LSPI framework for Reinforcement Learning, often achieving faster convergence and providing automatic feature selection via various kernel sparsification approaches. In this approach, policies are computed in a low-dimensional subspace generated by projecting the high-dimensional features onto a set of random basis vectors. We first show how Random Projections constitute an efficient sparsification technique and how our method often converges faster than regular LSPI, at lower computational cost. The theoretical foundation underlying this approach is a fast approximation of Singular Value Decomposition (SVD). Finally, simulation results on benchmark MDP domains confirm gains in both computation time and performance in large feature spaces.
format	Online Article Text
id	pubmed-3376585
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	Molecular Diversity Preservation International (MDPI)
record_format	MEDLINE/PubMed
spelling	pubmed-3376585 2012-06-25 Sensors (Basel) Article Molecular Diversity Preservation International (MDPI) 2012-02-28 /pmc/articles/PMC3376585/ /pubmed/22736969 http://dx.doi.org/10.3390/s120302632 Text en © 2012 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
title	Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3376585/ https://www.ncbi.nlm.nih.gov/pubmed/22736969 http://dx.doi.org/10.3390/s120302632
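
The description above says that policies are computed in a low-dimensional subspace obtained by projecting high-dimensional features onto a random basis. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes a plain Gaussian random projection (swapped in for the spherically random rotation and coordinate sampling named in the abstract), a ridge-regularized LSTD solve in place of the full kernelized policy-iteration loop, and synthetic features and rewards. All function names, sizes, and parameter values are hypothetical.

```python
import numpy as np


def random_projection_matrix(n_features, n_components, rng):
    """Gaussian random matrix, scaled so squared norms are preserved in expectation."""
    return rng.standard_normal((n_features, n_components)) / np.sqrt(n_components)


def lstd_weights(phi, phi_next, rewards, gamma=0.95, ridge=1e-3):
    """Solve the ridge-regularized LSTD system A w = b for the given features."""
    a = phi.T @ (phi - gamma * phi_next) + ridge * np.eye(phi.shape[1])
    b = phi.T @ rewards
    return np.linalg.solve(a, b)


rng = np.random.default_rng(0)
n_samples, n_features, n_components = 200, 1000, 50

# Synthetic stand-ins (assumption): features of sampled transitions and their rewards.
phi = rng.standard_normal((n_samples, n_features))
phi_next = rng.standard_normal((n_samples, n_features))
rewards = rng.standard_normal(n_samples)

# Project both feature matrices with the same random matrix.
proj = random_projection_matrix(n_features, n_components, rng)
phi_low = phi @ proj
phi_next_low = phi_next @ proj

# Value-function weights are fitted in the 50-dimensional projected space.
w = lstd_weights(phi_low, phi_next_low, rewards)

# Average ratio of projected to original squared row norms (close to 1 in expectation).
norm_ratio = np.mean(np.sum(phi_low**2, axis=1) / np.sum(phi**2, axis=1))
```

In this sketch the linear system has size 50 x 50 rather than 1000 x 1000, which is where the computational saving described in the abstract would come from. The quality of the resulting policy depends on the projection dimension and on the problem.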


Similar Items