Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots
We address the problem of learning relationships on state variables in Partially Observable Markov Decision Processes (POMDPs) to improve planning performance. Specifically, we focus on Partially Observable Monte Carlo Planning (POMCP) and represent the acquired knowledge with a Markov Random Field...
Main authors: | Zuccotto, Maddalena, Piccinelli, Marco, Castellini, Alberto, Marchesini, Enrico, Farinelli, Alessandro |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Robotics and AI |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9343685/ https://www.ncbi.nlm.nih.gov/pubmed/35928541 http://dx.doi.org/10.3389/frobt.2022.819107 |
_version_ | 1784761043053969408 |
---|---|
author | Zuccotto, Maddalena Piccinelli, Marco Castellini, Alberto Marchesini, Enrico Farinelli, Alessandro |
author_facet | Zuccotto, Maddalena Piccinelli, Marco Castellini, Alberto Marchesini, Enrico Farinelli, Alessandro |
author_sort | Zuccotto, Maddalena |
collection | PubMed |
description | We address the problem of learning relationships on state variables in Partially Observable Markov Decision Processes (POMDPs) to improve planning performance. Specifically, we focus on Partially Observable Monte Carlo Planning (POMCP) and represent the acquired knowledge with a Markov Random Field (MRF). We propose, in particular, a method for learning these relationships on a robot as POMCP is used to plan future actions. Then, we present an algorithm that deals with cases in which the MRF is used on episodes having unlikely states with respect to the equality relationships represented by the MRF. Our approach acquires information from the agent’s action outcomes to adapt online the MRF if a mismatch is detected between the MRF and the true state. We test this technique on two domains, rocksample, a standard rover exploration task, and a problem of velocity regulation in industrial mobile robotic platforms, showing that the MRF adaptation algorithm improves the planning performance with respect to the standard approach, which does not adapt the MRF online. Finally, a ROS-based architecture is proposed, which allows running the MRF learning, the MRF adaptation, and MRF usage in POMCP on real robotic platforms. In this case, we successfully tested the architecture on a Gazebo simulator of rocksample. A video of the experiments is available in the Supplementary Material, and the code of the ROS-based architecture is available online. |
format | Online Article Text |
id | pubmed-9343685 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-93436852022-08-03 Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots Zuccotto, Maddalena Piccinelli, Marco Castellini, Alberto Marchesini, Enrico Farinelli, Alessandro Front Robot AI Robotics and AI We address the problem of learning relationships on state variables in Partially Observable Markov Decision Processes (POMDPs) to improve planning performance. Specifically, we focus on Partially Observable Monte Carlo Planning (POMCP) and represent the acquired knowledge with a Markov Random Field (MRF). We propose, in particular, a method for learning these relationships on a robot as POMCP is used to plan future actions. Then, we present an algorithm that deals with cases in which the MRF is used on episodes having unlikely states with respect to the equality relationships represented by the MRF. Our approach acquires information from the agent’s action outcomes to adapt online the MRF if a mismatch is detected between the MRF and the true state. We test this technique on two domains, rocksample, a standard rover exploration task, and a problem of velocity regulation in industrial mobile robotic platforms, showing that the MRF adaptation algorithm improves the planning performance with respect to the standard approach, which does not adapt the MRF online. Finally, a ROS-based architecture is proposed, which allows running the MRF learning, the MRF adaptation, and MRF usage in POMCP on real robotic platforms. In this case, we successfully tested the architecture on a Gazebo simulator of rocksample. A video of the experiments is available in the Supplementary Material, and the code of the ROS-based architecture is available online. Frontiers Media S.A. 2022-07-19 /pmc/articles/PMC9343685/ /pubmed/35928541 http://dx.doi.org/10.3389/frobt.2022.819107 Text en Copyright © 2022 Zuccotto, Piccinelli, Castellini, Marchesini and Farinelli. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Robotics and AI Zuccotto, Maddalena Piccinelli, Marco Castellini, Alberto Marchesini, Enrico Farinelli, Alessandro Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots |
title | Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots |
title_full | Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots |
title_fullStr | Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots |
title_full_unstemmed | Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots |
title_short | Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots |
title_sort | learning state-variable relationships in pomcp: a framework for mobile robots |
topic | Robotics and AI |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9343685/ https://www.ncbi.nlm.nih.gov/pubmed/35928541 http://dx.doi.org/10.3389/frobt.2022.819107 |
work_keys_str_mv | AT zuccottomaddalena learningstatevariablerelationshipsinpomcpaframeworkformobilerobots AT piccinellimarco learningstatevariablerelationshipsinpomcpaframeworkformobilerobots AT castellinialberto learningstatevariablerelationshipsinpomcpaframeworkformobilerobots AT marchesinienrico learningstatevariablerelationshipsinpomcpaframeworkformobilerobots AT farinellialessandro learningstatevariablerelationshipsinpomcpaframeworkformobilerobots |