Maximum Causal Entropy Specification Inference from Demonstrations

In many settings, such as robotics, demonstrations provide a natural way to specify tasks. However, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts can be safely composed or do not explicitly capture temporal properties. Motivated by this def...

Bibliographic Details
Main authors: Vazquez-Chanlatte, Marcell, Seshia, Sanjit A.
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363230/
http://dx.doi.org/10.1007/978-3-030-53291-8_15
collection PubMed
description In many settings, such as robotics, demonstrations provide a natural way to specify tasks. However, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts can be safely composed or do not explicitly capture temporal properties. Motivated by this deficit, recent works have proposed learning Boolean task specifications, a class of Boolean non-Markovian rewards which admit well-defined composition and explicitly handle historical dependencies. This work continues this line of research by adapting maximum causal entropy inverse reinforcement learning to estimate the posterior probability of a specification given a multi-set of demonstrations. The key algorithmic insight is to leverage the extensive literature and tooling on reduced ordered binary decision diagrams to efficiently encode a time-unrolled Markov Decision Process. This enables transforming a naïve algorithm with running time exponential in the episode length into a polynomial-time algorithm.
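The description above can be made concrete with a minimal sketch of the naïve, exponential-time variant it mentions, not the binary-decision-diagram encoding and not the authors' implementation. It applies the standard maximum causal entropy soft Bellman backup to an explicitly enumerated time-unrolled process and then scores candidate specifications against demonstrations under a uniform prior. Everything named here is an assumption made for illustration: the noisy one-dimensional walk, the constants `HORIZON`, `SLIP`, and `BETA`, the two candidate specifications, the demonstrations, and the choice of a terminal reward of `BETA` when the specification is satisfied.

```python
import math
from functools import lru_cache

# All constants below are hypothetical choices for this illustration.
HORIZON = 3        # episode length
ACTIONS = (-1, 1)  # step left or step right
SLIP = 0.1         # probability that the environment flips the action
BETA = 5.0         # reward granted when the specification is satisfied


def transitions(state, action):
    """Return (probability, next_state) pairs for a noisy 1-D walk."""
    return ((1.0 - SLIP, state + action), (SLIP, state - action))


def soft_backup(spec):
    """Soft Bellman backup over full state histories (time-unrolled).

    Values are indexed by the whole trace, so the Boolean reward may depend
    on history. The cost is exponential in HORIZON.
    """
    @lru_cache(maxsize=None)
    def value(trace):
        if len(trace) == HORIZON + 1:
            return BETA * float(spec(trace))
        return math.log(sum(math.exp(q(trace, a)) for a in ACTIONS))

    def q(trace, action):
        return sum(p * value(trace + (nxt,))
                   for p, nxt in transitions(trace[-1], action))

    return value, q


def log_likelihood(states, actions, spec):
    """Log-probability of the demonstrated actions, with pi(a | trace) = exp(Q - V)."""
    value, q = soft_backup(spec)
    total = 0.0
    for t, action in enumerate(actions):
        prefix = tuple(states[: t + 1])
        total += q(prefix, action) - value(prefix)
    return total


def posterior(demos, specs):
    """Posterior over candidate specifications, assuming a uniform prior."""
    scores = {name: sum(log_likelihood(s, a, spec) for s, a in demos)
              for name, spec in specs.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(score - top) for name, score in scores.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


# Hypothetical candidate specifications over a trace of visited states.
SPECS = {
    "reach_right": lambda trace: trace[-1] > 0,
    "reach_left": lambda trace: trace[-1] < 0,
}
# Hypothetical demonstrations: (visited states, chosen actions).
DEMOS = [((0, 1, 2, 3), (1, 1, 1)), ((0, 1, 0, 1), (1, 1, 1))]
POSTERIOR = posterior(DEMOS, SPECS)
```

Environment transition probabilities do not depend on the candidate specification, so they cancel in the normalization and only the action likelihoods are scored. Because every demonstrated action steps right, `POSTERIOR` places more mass on `reach_right` than on `reach_left`.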
id pubmed-7363230
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7363230 2020-07-16 Maximum Causal Entropy Specification Inference from Demonstrations Vazquez-Chanlatte, Marcell Seshia, Sanjit A. Computer Aided Verification Article 2020-06-16 /pmc/articles/PMC7363230/ http://dx.doi.org/10.1007/978-3-030-53291-8_15 Text en
© The Author(s) 2020. Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material.
If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.