Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System
Reinforcement learning (RL) is a promising direction in automated parking systems (APSs), as integrating planning and tracking control using RL can potentially maximize the overall performance. However, commonly used model-free RL requires many interactions to achieve acceptable performance, and mod...
Main authors: | Song, Shaoyu, Chen, Hui, Sun, Hongwei, Liu, Meicen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
MDPI
2020
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7766926/ https://www.ncbi.nlm.nih.gov/pubmed/33353153 http://dx.doi.org/10.3390/s20247297 |
_version_ | 1783628835560882176 |
---|---|
author | Song, Shaoyu Chen, Hui Sun, Hongwei Liu, Meicen |
author_facet | Song, Shaoyu Chen, Hui Sun, Hongwei Liu, Meicen |
author_sort | Song, Shaoyu |
collection | PubMed |
description | Reinforcement learning (RL) is a promising direction in automated parking systems (APSs), as integrating planning and tracking control using RL can potentially maximize the overall performance. However, commonly used model-free RL requires many interactions to achieve acceptable performance, and model-based RL in APS cannot continuously learn. In this paper, a data-efficient RL method is constructed to learn from data by use of a model-based method. The proposed method uses a truncated Monte Carlo tree search to evaluate parking states and select moves. Two artificial neural networks are trained to provide the search probability of each tree branch and the final reward for each state using self-trained data. The data efficiency is enhanced by weighting exploration with parking trajectory returns, an adaptive exploration scheme, and experience augmentation with imaginary rollouts. Without human demonstrations, a novel training pipeline is also used to train the initial action guidance network and the state value network. Compared with path planning and path-following methods, the proposed integrated method can flexibly coordinate the longitudinal and lateral motion to park in a smaller parking space in one maneuver. Its adaptability to changes in the vehicle model is verified by joint CarSim and MATLAB simulation, demonstrating that the algorithm converges within a few iterations. Finally, experiments using a real vehicle platform are used to further verify the effectiveness of the proposed method. Compared with obtaining rewards using simulation, the proposed method achieves a better final parking attitude and success rate. |
format | Online Article Text |
id | pubmed-7766926 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-77669262020-12-28 Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System Song, Shaoyu Chen, Hui Sun, Hongwei Liu, Meicen Sensors (Basel) Article Reinforcement learning (RL) is a promising direction in automated parking systems (APSs), as integrating planning and tracking control using RL can potentially maximize the overall performance. However, commonly used model-free RL requires many interactions to achieve acceptable performance, and model-based RL in APS cannot continuously learn. In this paper, a data-efficient RL method is constructed to learn from data by use of a model-based method. The proposed method uses a truncated Monte Carlo tree search to evaluate parking states and select moves. Two artificial neural networks are trained to provide the search probability of each tree branch and the final reward for each state using self-trained data. The data efficiency is enhanced by weighting exploration with parking trajectory returns, an adaptive exploration scheme, and experience augmentation with imaginary rollouts. Without human demonstrations, a novel training pipeline is also used to train the initial action guidance network and the state value network. Compared with path planning and path-following methods, the proposed integrated method can flexibly co-ordinate the longitudinal and lateral motion to park a smaller parking space in one maneuver. Its adaptability to changes in the vehicle model is verified by joint Carsim and MATLAB simulation, demonstrating that the algorithm converges within a few iterations. Finally, experiments using a real vehicle platform are used to further verify the effectiveness of the proposed method. Compared with obtaining rewards using simulation, the proposed method achieves a better final parking attitude and success rate. MDPI 2020-12-18 /pmc/articles/PMC7766926/ /pubmed/33353153 http://dx.doi.org/10.3390/s20247297 Text en © 2020 by the authors. 
Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Song, Shaoyu Chen, Hui Sun, Hongwei Liu, Meicen Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System |
title | Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System |
title_full | Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System |
title_fullStr | Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System |
title_full_unstemmed | Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System |
title_short | Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System |
title_sort | data efficient reinforcement learning for integrated lateral planning and control in automated parking system |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7766926/ https://www.ncbi.nlm.nih.gov/pubmed/33353153 http://dx.doi.org/10.3390/s20247297 |
work_keys_str_mv | AT songshaoyu dataefficientreinforcementlearningforintegratedlateralplanningandcontrolinautomatedparkingsystem AT chenhui dataefficientreinforcementlearningforintegratedlateralplanningandcontrolinautomatedparkingsystem AT sunhongwei dataefficientreinforcementlearningforintegratedlateralplanningandcontrolinautomatedparkingsystem AT liumeicen dataefficientreinforcementlearningforintegratedlateralplanningandcontrolinautomatedparkingsystem |