Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots

We address the problem of learning relationships on state variables in Partially Observable Markov Decision Processes (POMDPs) to improve planning performance. Specifically, we focus on Partially Observable Monte Carlo Planning (POMCP) and represent the acquired knowledge with a Markov Random Field...

Bibliographic Details
Main authors:	Zuccotto, Maddalena; Piccinelli, Marco; Castellini, Alberto; Marchesini, Enrico; Farinelli, Alessandro
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Robotics and AI
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9343685/ https://www.ncbi.nlm.nih.gov/pubmed/35928541 http://dx.doi.org/10.3389/frobt.2022.819107

_version_	1784761043053969408
author	Zuccotto, Maddalena Piccinelli, Marco Castellini, Alberto Marchesini, Enrico Farinelli, Alessandro
author_sort	Zuccotto, Maddalena
collection	PubMed
description	We address the problem of learning relationships on state variables in Partially Observable Markov Decision Processes (POMDPs) to improve planning performance. Specifically, we focus on Partially Observable Monte Carlo Planning (POMCP) and represent the acquired knowledge with a Markov Random Field (MRF). We propose, in particular, a method for learning these relationships on a robot as POMCP is used to plan future actions. Then, we present an algorithm that deals with cases in which the MRF is used on episodes having unlikely states with respect to the equality relationships represented by the MRF. Our approach acquires information from the agent’s action outcomes to adapt online the MRF if a mismatch is detected between the MRF and the true state. We test this technique on two domains, rocksample, a standard rover exploration task, and a problem of velocity regulation in industrial mobile robotic platforms, showing that the MRF adaptation algorithm improves the planning performance with respect to the standard approach, which does not adapt the MRF online. Finally, a ROS-based architecture is proposed, which allows running the MRF learning, the MRF adaptation, and MRF usage in POMCP on real robotic platforms. In this case, we successfully tested the architecture on a Gazebo simulator of rocksample. A video of the experiments is available in the Supplementary Material, and the code of the ROS-based architecture is available online.
format	Online Article Text
id	pubmed-9343685
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9343685 2022-08-03 Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots Zuccotto, Maddalena Piccinelli, Marco Castellini, Alberto Marchesini, Enrico Farinelli, Alessandro Front Robot AI Robotics and AI Frontiers Media S.A. 2022-07-19 /pmc/articles/PMC9343685/ /pubmed/35928541 http://dx.doi.org/10.3389/frobt.2022.819107 Text en Copyright © 2022 Zuccotto, Piccinelli, Castellini, Marchesini and Farinelli.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots
topic	Robotics and AI
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9343685/ https://www.ncbi.nlm.nih.gov/pubmed/35928541 http://dx.doi.org/10.3389/frobt.2022.819107

Similar Items