Optimistic Value Iteration
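Markov decision processes are widely used for planning and verification in settings that combine controllable or adversarial choices with probabilistic behaviour. The standard analysis algorithm, value iteration, only provides lower bounds on infinite-horizon probabilities and rewards. Two “sound” variations, which also deliver an upper bound, have recently appeared. In this paper, we present a new sound approach that leverages value iteration’s ability to usually deliver good lower bounds: we obtain a lower bound via standard value iteration, use the result to “guess” an upper bound, and prove the latter’s correctness. We present this optimistic value iteration approach for computing reachability probabilities as well as expected rewards. It is easy to implement and performs well, as we show via an extensive experimental evaluation using our implementation within the mcsta model checker of the Modest Toolset.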
Main authors: | Hartmanns, Arnd; Kaminski, Benjamin Lucien |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363440/ http://dx.doi.org/10.1007/978-3-030-53291-8_26 |
_version_ | 1783559655715241984 |
---|---|
author | Hartmanns, Arnd Kaminski, Benjamin Lucien |
author_facet | Hartmanns, Arnd Kaminski, Benjamin Lucien |
author_sort | Hartmanns, Arnd |
collection | PubMed |
description | Markov decision processes are widely used for planning and verification in settings that combine controllable or adversarial choices with probabilistic behaviour. The standard analysis algorithm, value iteration, only provides lower bounds on infinite-horizon probabilities and rewards. Two “sound” variations, which also deliver an upper bound, have recently appeared. In this paper, we present a new sound approach that leverages value iteration’s ability to usually deliver good lower bounds: we obtain a lower bound via standard value iteration, use the result to “guess” an upper bound, and prove the latter’s correctness. We present this optimistic value iteration approach for computing reachability probabilities as well as expected rewards. It is easy to implement and performs well, as we show via an extensive experimental evaluation using our implementation within the mcsta model checker of the Modest Toolset. |
format | Online Article Text |
id | pubmed-7363440 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-73634402020-07-16 Optimistic Value Iteration Hartmanns, Arnd Kaminski, Benjamin Lucien Computer Aided Verification Article Markov decision processes are widely used for planning and verification in settings that combine controllable or adversarial choices with probabilistic behaviour. The standard analysis algorithm, value iteration, only provides lower bounds on infinite-horizon probabilities and rewards. Two “sound” variations, which also deliver an upper bound, have recently appeared. In this paper, we present a new sound approach that leverages value iteration’s ability to usually deliver good lower bounds: we obtain a lower bound via standard value iteration, use the result to “guess” an upper bound, and prove the latter’s correctness. We present this optimistic value iteration approach for computing reachability probabilities as well as expected rewards. It is easy to implement and performs well, as we show via an extensive experimental evaluation using our implementation within the mcsta model checker of the Modest Toolset. 2020-06-16 /pmc/articles/PMC7363440/ http://dx.doi.org/10.1007/978-3-030-53291-8_26 Text en © The Author(s) 2020 Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. |
spellingShingle | Article Hartmanns, Arnd Kaminski, Benjamin Lucien Optimistic Value Iteration |
title | Optimistic Value Iteration |
title_full | Optimistic Value Iteration |
title_fullStr | Optimistic Value Iteration |
title_full_unstemmed | Optimistic Value Iteration |
title_short | Optimistic Value Iteration |
title_sort | optimistic value iteration |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363440/ http://dx.doi.org/10.1007/978-3-030-53291-8_26 |