Maximum Causal Entropy Specification Inference from Demonstrations
In many settings, such as robotics, demonstrations provide a natural way to specify tasks. However, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts can be safely composed or do not explicitly capture temporal properties. Motivated by this def...
Main authors: | Vazquez-Chanlatte, Marcell; Seshia, Sanjit A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363230/ http://dx.doi.org/10.1007/978-3-030-53291-8_15 |
_version_ | 1783559628173344768 |
---|---|
author | Vazquez-Chanlatte, Marcell Seshia, Sanjit A. |
author_facet | Vazquez-Chanlatte, Marcell Seshia, Sanjit A. |
author_sort | Vazquez-Chanlatte, Marcell |
collection | PubMed |
description | In many settings, such as robotics, demonstrations provide a natural way to specify tasks. However, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts can be safely composed or do not explicitly capture temporal properties. Motivated by this deficit, recent works have proposed learning Boolean task specifications, a class of Boolean non-Markovian rewards which admit well-defined composition and explicitly handle historical dependencies. This work continues this line of research by adapting maximum causal entropy inverse reinforcement learning to estimate the posterior probability of a specification given a multi-set of demonstrations. The key algorithmic insight is to leverage the extensive literature and tooling on reduced ordered binary decision diagrams to efficiently encode a time-unrolled Markov Decision Process. This enables transforming a naïve algorithm with running time exponential in the episode length into a polynomial-time algorithm. |
format | Online Article Text |
id | pubmed-7363230 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-73632302020-07-16 Maximum Causal Entropy Specification Inference from Demonstrations Vazquez-Chanlatte, Marcell Seshia, Sanjit A. Computer Aided Verification Article In many settings, such as robotics, demonstrations provide a natural way to specify tasks. However, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts can be safely composed or do not explicitly capture temporal properties. Motivated by this deficit, recent works have proposed learning Boolean task specifications, a class of Boolean non-Markovian rewards which admit well-defined composition and explicitly handle historical dependencies. This work continues this line of research by adapting maximum causal entropy inverse reinforcement learning to estimate the posteriori probability of a specification given a multi-set of demonstrations. The key algorithmic insight is to leverage the extensive literature and tooling on reduced ordered binary decision diagrams to efficiently encode a time unrolled Markov Decision Process. This enables transforming a naïve algorithm with running time exponential in the episode length, into a polynomial time algorithm. 2020-06-16 /pmc/articles/PMC7363230/ http://dx.doi.org/10.1007/978-3-030-53291-8_15 Text en © The Author(s) 2020 Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. 
If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. |
spellingShingle | Article Vazquez-Chanlatte, Marcell Seshia, Sanjit A. Maximum Causal Entropy Specification Inference from Demonstrations |
title | Maximum Causal Entropy Specification Inference from Demonstrations |
title_full | Maximum Causal Entropy Specification Inference from Demonstrations |
title_fullStr | Maximum Causal Entropy Specification Inference from Demonstrations |
title_full_unstemmed | Maximum Causal Entropy Specification Inference from Demonstrations |
title_short | Maximum Causal Entropy Specification Inference from Demonstrations |
title_sort | maximum causal entropy specification inference from demonstrations |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363230/ http://dx.doi.org/10.1007/978-3-030-53291-8_15 |
work_keys_str_mv | AT vazquezchanlattemarcell maximumcausalentropyspecificationinferencefromdemonstrations AT seshiasanjita maximumcausalentropyspecificationinferencefromdemonstrations |