Predicting queue wait time probabilities for multi-scale computing

We describe a method for queue wait time prediction in supercomputing clusters. It was designed for use as a part of multi-criteria brokering mechanisms for resource selection in a multi-site High Performance Computing environment. The aim is to incorporate the time jobs stay queued in the schedulin...


Bibliographic Details
Main authors:	Jancauskas, Vytautas; Piontek, Tomasz; Kopta, Piotr; Bosak, Bartosz
Format:	Online Article Text
Language:	English
Published:	The Royal Society Publishing 2019
Subjects:	Articles
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6388012/ https://www.ncbi.nlm.nih.gov/pubmed/30967035 http://dx.doi.org/10.1098/rsta.2018.0151

_version_	1783397680858267648
author	Jancauskas, Vytautas Piontek, Tomasz Kopta, Piotr Bosak, Bartosz
collection	PubMed
description	We describe a method for queue wait time prediction in supercomputing clusters. It was designed for use as a part of multi-criteria brokering mechanisms for resource selection in a multi-site High Performance Computing environment. The aim is to incorporate the time jobs stay queued in the scheduling system into the selection criteria. Our method can also be used by the end users to estimate the time to completion of their computing jobs. It uses historical data about the particular system to make predictions. It returns a list of probability estimates of the form (t(i), p(i)), where p(i) is the probability that the job will start before time t(i). Times t(i) can be chosen more or less freely when deploying the system. Compared to regression methods that only return a single number as a queue wait time estimate (usually without error bars) our prediction system provides more useful information. The probability estimates are calculated using the Bayes theorem with the naive assumption that the attributes describing the jobs are independent. They are further calibrated to make sure they are as accurate as possible, given available data. We describe our service and its REST API and the underlying methods in detail and provide empirical evidence in support of the method's efficacy. This article is part of the theme issue ‘Multiscale modelling, simulation and computing: from the desktop to the exascale’.
format	Online Article Text
id	pubmed-6388012
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	The Royal Society Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-63880122019-02-28 Predicting queue wait time probabilities for multi-scale computing Jancauskas, Vytautas Piontek, Tomasz Kopta, Piotr Bosak, Bartosz Philos Trans A Math Phys Eng Sci Articles We describe a method for queue wait time prediction in supercomputing clusters. It was designed for use as a part of multi-criteria brokering mechanisms for resource selection in a multi-site High Performance Computing environment. The aim is to incorporate the time jobs stay queued in the scheduling system into the selection criteria. Our method can also be used by the end users to estimate the time to completion of their computing jobs. It uses historical data about the particular system to make predictions. It returns a list of probability estimates of the form (t(i), p(i)), where p(i) is the probability that the job will start before time t(i). Times t(i) can be chosen more or less freely when deploying the system. Compared to regression methods that only return a single number as a queue wait time estimate (usually without error bars) our prediction system provides more useful information. The probability estimates are calculated using the Bayes theorem with the naive assumption that the attributes describing the jobs are independent. They are further calibrated to make sure they are as accurate as possible, given available data. We describe our service and its REST API and the underlying methods in detail and provide empirical evidence in support of the method's efficacy. This article is part of the theme issue ‘Multiscale modelling, simulation and computing: from the desktop to the exascale’. The Royal Society Publishing 2019-04-08 2019-02-18 /pmc/articles/PMC6388012/ /pubmed/30967035 http://dx.doi.org/10.1098/rsta.2018.0151 Text en © 2019 The Authors. 
http://creativecommons.org/licenses/by/4.0/ Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
spellingShingle	Articles Jancauskas, Vytautas Piontek, Tomasz Kopta, Piotr Bosak, Bartosz Predicting queue wait time probabilities for multi-scale computing
title	Predicting queue wait time probabilities for multi-scale computing
topic	Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6388012/ https://www.ncbi.nlm.nih.gov/pubmed/30967035 http://dx.doi.org/10.1098/rsta.2018.0151
work_keys_str_mv	AT jancauskasvytautas predictingqueuewaittimeprobabilitiesformultiscalecomputing AT piontektomasz predictingqueuewaittimeprobabilitiesformultiscalecomputing AT koptapiotr predictingqueuewaittimeprobabilitiesformultiscalecomputing AT bosakbartosz predictingqueuewaittimeprobabilitiesformultiscalecomputing
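
The description field above says the service returns pairs (t(i), p(i)) computed with Bayes' theorem under the naive independence assumption. The following is a minimal sketch of that idea, not the authors' implementation: the attribute names (`queue`, `cores`), the Laplace smoothing parameter `alpha`, and the function names `fit` and `predict` are all assumptions made for illustration, and the calibration step mentioned in the description is omitted.

```python
from collections import Counter, defaultdict


def fit(jobs, waits, thresholds):
    """Learn one naive Bayes model per threshold t for the event 'wait < t'.

    jobs: list of dicts mapping (hypothetical) attribute names to categorical values.
    waits: observed queue wait times, in the same units as the thresholds.
    """
    models = []
    for t in thresholds:
        labels = [w < t for w in waits]
        class_counts = Counter(labels)
        # feature_counts[label][attribute][value] -> number of historical jobs
        feature_counts = {True: defaultdict(Counter), False: defaultdict(Counter)}
        values = defaultdict(set)  # distinct values seen for each attribute
        for job, label in zip(jobs, labels):
            for name, value in job.items():
                feature_counts[label][name][value] += 1
                values[name].add(value)
        models.append((t, class_counts, feature_counts, values))
    return models


def predict(models, job, alpha=1.0):
    """Return a list of (t, p) where p estimates P(wait < t | job attributes)."""
    estimates = []
    for t, class_counts, feature_counts, values in models:
        total = sum(class_counts.values())
        score = {}
        for label in (True, False):
            # Smoothed class prior P(label).
            p = (class_counts[label] + alpha) / (total + 2 * alpha)
            for name, value in job.items():
                # Smoothed likelihood P(value | label); one extra slot is
                # reserved for attribute values never seen in the history.
                k = len(values[name]) + 1
                count = feature_counts[label][name][value]
                p *= (count + alpha) / (class_counts[label] + alpha * k)
            score[label] = p
        # Bayes' theorem: normalise over the two classes.
        estimates.append((t, score[True] / (score[True] + score[False])))
    return estimates


if __name__ == "__main__":
    # Made-up history, for illustration only (waits in minutes).
    history = [
        {"queue": "short", "cores": "small"},
        {"queue": "short", "cores": "small"},
        {"queue": "long", "cores": "large"},
        {"queue": "long", "cores": "large"},
        {"queue": "short", "cores": "large"},
        {"queue": "long", "cores": "small"},
    ]
    observed_waits = [5, 10, 120, 300, 30, 90]
    models = fit(history, observed_waits, thresholds=[15, 60, 240])
    for t, p in predict(models, {"queue": "short", "cores": "small"}):
        print(f"P(wait < {t} min) = {p:.2f}")
```

Training a separate two-class model per threshold keeps the sketch short, but it does not by itself guarantee that the estimates increase with t; a deployed system would presumably need the calibration described in the article, or a similar post-processing step.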

Similar items