
Comparing Deep Reinforcement Learning Algorithms’ Ability to Safely Navigate Challenging Waters

Reinforcement Learning (RL) controllers have proven effective at tackling the dual objectives of path following and collision avoidance. However, finding which RL algorithm setup optimally trades off these two tasks is not straightforward. This work proposes a methodology for exploring this question by analyzing the performance and task-specific behavioral characteristics of a range of RL algorithms applied to path following and collision avoidance for underactuated surface vehicles in environments of increasing complexity. The results show that, compared to the other RL algorithms considered, the Proximal Policy Optimization (PPO) algorithm exhibits superior robustness to changes in environment complexity and in the reward function, as well as when generalizing to environments with a considerable domain gap from the training environment. Although the proposed reward function significantly improves the competing algorithms' ability to solve the training environment, an unexpected consequence of the dimensionality reduction in the sensor suite, combined with the domain gap, is identified as the source of their impaired generalization performance.
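
To make the kind of comparison described in the abstract concrete, below is a minimal sketch of how several RL algorithms could be trained and evaluated on one shared control task. It assumes a Gym-style environment and the Stable-Baselines3 implementations of PPO and two alternative algorithms; the environment id, algorithm selection, and hyperparameters are illustrative placeholders and are not taken from the paper.

```python
# Hypothetical sketch: train several RL algorithms on the same task and
# compare mean episodic return. Not the paper's actual setup.
import gymnasium as gym
from stable_baselines3 import PPO, SAC, TD3
from stable_baselines3.common.evaluation import evaluate_policy

ALGORITHMS = {"PPO": PPO, "SAC": SAC, "TD3": TD3}

def compare_algorithms(env_id: str, total_timesteps: int = 100_000) -> dict:
    """Train each algorithm on the same environment and report mean return."""
    results = {}
    for name, algo_cls in ALGORITHMS.items():
        env = gym.make(env_id)
        model = algo_cls("MlpPolicy", env, verbose=0)
        model.learn(total_timesteps=total_timesteps)
        mean_return, std_return = evaluate_policy(model, env, n_eval_episodes=20)
        results[name] = (mean_return, std_return)
        env.close()
    return results

if __name__ == "__main__":
    # "Pendulum-v1" stands in for a custom path-following / collision-avoidance
    # environment for an underactuated surface vehicle.
    print(compare_algorithms("Pendulum-v1", total_timesteps=10_000))
```

In the study itself, the comparison additionally spans environments of increasing complexity and a test environment with a considerable domain gap, rather than a single benchmark task.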


Bibliographic Details
Main Authors: Larsen, Thomas Nakken; Teigen, Halvor Ødegård; Laache, Torkel; Varagnolo, Damiano; Rasheed, Adil
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Robotics and AI
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8473616/
https://www.ncbi.nlm.nih.gov/pubmed/34589522
http://dx.doi.org/10.3389/frobt.2021.738113
Collection: PubMed
Record ID: pubmed-8473616
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: Front Robot AI
Publication Date: 2021-09-13
License: Copyright © 2021 Larsen, Teigen, Laache, Varagnolo and Rasheed. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY), https://creativecommons.org/licenses/by/4.0/. The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.