Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots

We address the problem of learning relationships on state variables in Partially Observable Markov Decision Processes (POMDPs) to improve planning performance. Specifically, we focus on Partially Observable Monte Carlo Planning (POMCP) and represent the acquired knowledge with a Markov Random Field...

Bibliographic Details
Main Authors: Zuccotto, Maddalena; Piccinelli, Marco; Castellini, Alberto; Marchesini, Enrico; Farinelli, Alessandro
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Robotics and AI
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9343685/
https://www.ncbi.nlm.nih.gov/pubmed/35928541
http://dx.doi.org/10.3389/frobt.2022.819107
author Zuccotto, Maddalena
Piccinelli, Marco
Castellini, Alberto
Marchesini, Enrico
Farinelli, Alessandro
collection PubMed
description We address the problem of learning relationships among state variables in Partially Observable Markov Decision Processes (POMDPs) to improve planning performance. Specifically, we focus on Partially Observable Monte Carlo Planning (POMCP) and represent the acquired knowledge with a Markov Random Field (MRF). In particular, we propose a method for learning these relationships on a robot while POMCP is used to plan future actions. Then, we present an algorithm that deals with cases in which the MRF is used in episodes whose states are unlikely with respect to the equality relationships represented by the MRF. Our approach acquires information from the agent’s action outcomes to adapt the MRF online if a mismatch is detected between the MRF and the true state. We test this technique on two domains: rocksample, a standard rover exploration task, and a velocity regulation problem in industrial mobile robotic platforms. The results show that the MRF adaptation algorithm improves planning performance with respect to the standard approach, which does not adapt the MRF online. Finally, we propose a ROS-based architecture that allows running MRF learning, MRF adaptation, and MRF usage in POMCP on real robotic platforms. In this case, we successfully tested the architecture on a Gazebo simulator of rocksample. A video of the experiments is available in the Supplementary Material, and the code of the ROS-based architecture is available online.
format Online
Article
Text
id pubmed-9343685
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9343685 2022-08-03 Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots. Zuccotto, Maddalena; Piccinelli, Marco; Castellini, Alberto; Marchesini, Enrico; Farinelli, Alessandro. Front Robot AI. Robotics and AI. Frontiers Media S.A. 2022-07-19 /pmc/articles/PMC9343685/ /pubmed/35928541 http://dx.doi.org/10.3389/frobt.2022.819107 Text en Copyright © 2022 Zuccotto, Piccinelli, Castellini, Marchesini and Farinelli.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots
topic Robotics and AI