Maximum Causal Entropy Specification Inference from Demonstrations

In many settings, such as robotics, demonstrations provide a natural way to specify tasks. However, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts can be safely composed or do not explicitly capture temporal properties. Motivated by this def...

Bibliographic Details
Main Authors:	Vazquez-Chanlatte, Marcell, Seshia, Sanjit A.
Format:	Online Article Text
Language:	English
Published:	2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363230/ http://dx.doi.org/10.1007/978-3-030-53291-8_15

author	Vazquez-Chanlatte, Marcell Seshia, Sanjit A.
collection	PubMed
description	In many settings, such as robotics, demonstrations provide a natural way to specify tasks. However, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts can be safely composed or do not explicitly capture temporal properties. Motivated by this deficit, recent works have proposed learning Boolean task specifications, a class of Boolean non-Markovian rewards which admit well-defined composition and explicitly handle historical dependencies. This work continues this line of research by adapting maximum causal entropy inverse reinforcement learning to estimate the posterior probability of a specification given a multi-set of demonstrations. The key algorithmic insight is to leverage the extensive literature and tooling on reduced ordered binary decision diagrams to efficiently encode a time-unrolled Markov Decision Process. This enables transforming a naïve algorithm with running time exponential in the episode length into a polynomial-time algorithm.
format	Online Article Text
id	pubmed-7363230
institution	National Center for Biotechnology Information
language	English
publishDate	2020
record_format	MEDLINE/PubMed
spelling	pubmed-73632302020-07-16 Maximum Causal Entropy Specification Inference from Demonstrations Vazquez-Chanlatte, Marcell Seshia, Sanjit A. Computer Aided Verification Article In many settings, such as robotics, demonstrations provide a natural way to specify tasks. However, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts can be safely composed or do not explicitly capture temporal properties. Motivated by this deficit, recent works have proposed learning Boolean task specifications, a class of Boolean non-Markovian rewards which admit well-defined composition and explicitly handle historical dependencies. This work continues this line of research by adapting maximum causal entropy inverse reinforcement learning to estimate the posteriori probability of a specification given a multi-set of demonstrations. The key algorithmic insight is to leverage the extensive literature and tooling on reduced ordered binary decision diagrams to efficiently encode a time unrolled Markov Decision Process. This enables transforming a naïve algorithm with running time exponential in the episode length, into a polynomial time algorithm. 2020-06-16 /pmc/articles/PMC7363230/ http://dx.doi.org/10.1007/978-3-030-53291-8_15 Text en © The Author(s) 2020 Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. 
If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
title	Maximum Causal Entropy Specification Inference from Demonstrations
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363230/ http://dx.doi.org/10.1007/978-3-030-53291-8_15
