Policy-Gradient and Actor-Critic Based State Representation Learning for Safe Driving of Autonomous Vehicles

In this paper, we propose an environment perception framework for autonomous driving using state representation learning (SRL). Unlike existing Q-learning based methods for efficient environment perception and object detection, our proposed method takes the learning loss into account under determini...

Bibliographic Details
Main Authors:	Gupta, Abhishek; Khwaja, Ahmed Shaharyar; Anpalagan, Alagan; Guan, Ling; Venkatesh, Bala
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660054/ https://www.ncbi.nlm.nih.gov/pubmed/33105863 http://dx.doi.org/10.3390/s20215991

_version_	1783608928320356352
author	Gupta, Abhishek; Khwaja, Ahmed Shaharyar; Anpalagan, Alagan; Guan, Ling; Venkatesh, Bala
author_sort	Gupta, Abhishek
collection	PubMed
description	In this paper, we propose an environment perception framework for autonomous driving using state representation learning (SRL). Unlike existing Q-learning based methods for efficient environment perception and object detection, our proposed method takes the learning loss into account under deterministic as well as stochastic policy gradient. Through a combination of variational autoencoder (VAE), deep deterministic policy gradient (DDPG), and soft actor-critic (SAC), we focus on uninterrupted and reasonably safe autonomous driving without steering off the track for a considerable driving distance. Our proposed technique exhibits learning in autonomous vehicles under complex interactions with the environment, without being explicitly trained on driving datasets. To ensure the effectiveness of the scheme over a sustained period of time, we employ a reward-penalty based system where a negative reward is associated with an unfavourable action and a positive reward is awarded for favourable actions. The results obtained through simulations on DonKey simulator show the effectiveness of our proposed method by examining the variations in policy loss, value loss, reward function, and cumulative reward for ‘VAE+DDPG’ and ‘VAE+SAC’ over the learning process.
format	Online Article Text
id	pubmed-7660054
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-76600542020-11-13 Policy-Gradient and Actor-Critic Based State Representation Learning for Safe Driving of Autonomous Vehicles Gupta, Abhishek Khwaja, Ahmed Shaharyar Anpalagan, Alagan Guan, Ling Venkatesh, Bala Sensors (Basel) Article In this paper, we propose an environment perception framework for autonomous driving using state representation learning (SRL). Unlike existing Q-learning based methods for efficient environment perception and object detection, our proposed method takes the learning loss into account under deterministic as well as stochastic policy gradient. Through a combination of variational autoencoder (VAE), deep deterministic policy gradient (DDPG), and soft actor-critic (SAC), we focus on uninterrupted and reasonably safe autonomous driving without steering off the track for a considerable driving distance. Our proposed technique exhibits learning in autonomous vehicles under complex interactions with the environment, without being explicitly trained on driving datasets. To ensure the effectiveness of the scheme over a sustained period of time, we employ a reward-penalty based system where a negative reward is associated with an unfavourable action and a positive reward is awarded for favourable actions. The results obtained through simulations on DonKey simulator show the effectiveness of our proposed method by examining the variations in policy loss, value loss, reward function, and cumulative reward for ‘VAE+DDPG’ and ‘VAE+SAC’ over the learning process. MDPI 2020-10-22 /pmc/articles/PMC7660054/ /pubmed/33105863 http://dx.doi.org/10.3390/s20215991 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	Policy-Gradient and Actor-Critic Based State Representation Learning for Safe Driving of Autonomous Vehicles
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660054/ https://www.ncbi.nlm.nih.gov/pubmed/33105863 http://dx.doi.org/10.3390/s20215991

Similar Items