
Jobs masonry in LHCb with elastic Grid Jobs

In any distributed computing infrastructure, a job is normally forbidden to run for an indefinite amount of time. This limitation is implemented using different technologies, the most common one being the CPU time limit implemented by batch queues. It is therefore important to have a good estimate o...


Bibliographic Details
Main authors: Stagni, F; Charpentier, Ph
Language: eng
Published: 2015
Subjects: Particle Physics - Experiment
Online access: https://dx.doi.org/10.1088/1742-6596/664/6/062060
http://cds.cern.ch/record/2019802
_version_ 1780946819303342080
author Stagni, F
Charpentier, Ph
author_facet Stagni, F
Charpentier, Ph
author_sort Stagni, F
collection CERN
description In any distributed computing infrastructure, a job is normally not allowed to run for an indefinite amount of time. This limit is enforced through different technologies, the most common being the CPU time limit imposed by batch queues. It is therefore important to have a good estimate of how much CPU work a job will require: otherwise, it might be killed by the batch system, or by whatever system is controlling the job's execution. In many modern interwares, jobs are actually executed by pilot jobs, which can use the whole available time to run multiple consecutive jobs. If at some point the time left in a pilot is too short for the execution of any job, the pilot has to be released, even though a shorter job could have used the remaining time efficiently. Within LHCbDIRAC, the LHCb extension of the DIRAC interware, we developed a simple way to fully exploit the computing capabilities available to a pilot, even for resources with limited time capabilities, by adding elasticity to production Monte Carlo (MC) simulation jobs. With our approach, independently of the time available, LHCbDIRAC will always be able to execute an MC job, whose length is adapted to the available amount of time: therefore the same job, running on different computing resources with different time limits, will produce different numbers of events. The decision on the number of events to be produced is made just in time at the start of the job, when the capabilities of the resource are known. To determine how many events an MC job will be instructed to produce, LHCbDIRAC requires just three values: the CPU work per event for that type of job, the power of the machine it is running on, and the time left for the job before it is killed. Knowing these values, we can estimate the number of events the job will be able to simulate with the available CPU time.
This paper will demonstrate that, using this simple but effective solution, LHCb makes more efficient use of the available resources and can easily use new types of resources. One example is resources provided by batch queues, where low-priority MC jobs can be used as "masonry" jobs in multi-job pilots. A second example is opportunistic resources with limited available time.
id cern-2019802
institution European Organization for Nuclear Research
language eng
publishDate 2015
record_format invenio
spelling cern-2019802 2022-08-10T13:00:43Z doi:10.1088/1742-6596/664/6/062060 http://cds.cern.ch/record/2019802 eng Stagni, F Charpentier, Ph Jobs masonry in LHCb with elastic Grid Jobs Particle Physics - Experiment LHCb-PROC-2015-015 CERN-LHCb-PROC-2015-015 oai:cds.cern.ch:2019802 2015-05-27
spellingShingle Particle Physics - Experiment
Stagni, F
Charpentier, Ph
Jobs masonry in LHCb with elastic Grid Jobs
title Jobs masonry in LHCb with elastic Grid Jobs
title_full Jobs masonry in LHCb with elastic Grid Jobs
title_fullStr Jobs masonry in LHCb with elastic Grid Jobs
title_full_unstemmed Jobs masonry in LHCb with elastic Grid Jobs
title_short Jobs masonry in LHCb with elastic Grid Jobs
title_sort jobs masonry in lhcb with elastic grid jobs
topic Particle Physics - Experiment
url https://dx.doi.org/10.1088/1742-6596/664/6/062060
http://cds.cern.ch/record/2019802
work_keys_str_mv AT stagnif jobsmasonryinlhcbwithelasticgridjobs
AT charpentierph jobsmasonryinlhcbwithelasticgridjobs
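
The abstract names three inputs for sizing an elastic MC job: the CPU work per event, the power of the machine, and the time left before the job is killed. A minimal sketch of that estimate follows. It is not the LHCbDIRAC implementation: the function name, the parameter names, the choice of units (for example HS06 for power and HS06 seconds for work) and the optional safety_fraction margin are all assumptions made for illustration.

```python
import math


def estimate_events(cpu_work_per_event, cpu_power, time_left, safety_fraction=1.0):
    """Estimate how many events fit in the CPU time still available.

    cpu_work_per_event: CPU work needed per event, in power-unit seconds
                        (assumed unit, e.g. HS06 seconds).
    cpu_power:          power of the worker node in the same power unit
                        (assumed unit, e.g. HS06).
    time_left:          CPU seconds left before the job would be killed.
    safety_fraction:    hypothetical margin in (0, 1]; not part of the abstract.
    """
    if cpu_work_per_event <= 0 or cpu_power <= 0:
        raise ValueError("cpu_work_per_event and cpu_power must be positive")
    if not 0 < safety_fraction <= 1:
        raise ValueError("safety_fraction must be in (0, 1]")

    # Total work the node can still deliver; a negative time left counts as zero.
    usable_work = max(time_left, 0) * cpu_power * safety_fraction

    # Only whole events can be simulated, so round down.
    return math.floor(usable_work / cpu_work_per_event)


if __name__ == "__main__":
    # One hour left on a node of power 10, with 100 work units per event.
    print(estimate_events(cpu_work_per_event=100.0, cpu_power=10.0, time_left=3600))
```

With these illustrative numbers the same job would be told to produce half as many events on a queue with half the remaining time, which is the elasticity the abstract describes.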