AutoPyFactory and the Cloud

Bibliographic Details
Main Authors: Caballero, J, Hover, J, Love, P
Language: eng
Published: 2013
Subjects: Detectors and Experimental Techniques
Online Access: http://cds.cern.ch/record/1607122
_version_ 1780931712603127808
author Caballero, J
Hover, J
Love, P
collection CERN
description AutoPyFactory (APF) is a next-generation pilot submission framework that has been used as part of the ATLAS workload management system (PanDA) for two years. APF is reliable, scalable, and offers easy and flexible configuration. Using a plugin-based architecture, APF polls the configured information and batch systems (including grid sites), decides how many additional pilot jobs are needed, and submits them. With the advent of cloud computing, providing resources goes beyond submitting pilots to grid sites: the resources on which the pilots will run also need to be managed. Handling both pilot submission and the virtual machine life cycle (creation, retirement, and termination) from the same framework allows robust and efficient management of the process. In this paper we describe the design and implementation of these virtual machine management capabilities of APF. Expanding on our plugin-based approach, we allow cascades of virtual resources associated with a job queue. A single workflow can be directed first to a private, facility-based cloud, then to a free academic cloud, then to spot-priced EC2 resources, and finally to on-demand commercial clouds. Limits, weighting, and priorities are supported, allowing free or less expensive resources to be used first, with costly resources used only when necessary. As demand drops, resources are drained and terminated in reverse order. Performance plots and time series will be included, showing how the implementation handles ramp-ups, ramp-downs, and spot terminations.
id cern-1607122
institution European Organization for Nuclear Research
language eng
publishDate 2013
record_format invenio
spelling cern-1607122 2019-09-30T06:29:59Z http://cds.cern.ch/record/1607122 eng Caballero, J; Hover, J; Love, P; AutoPyFactory and the Cloud; Detectors and Experimental Techniques; ATL-SOFT-SLIDE-2013-816 oai:cds.cern.ch:1607122 2013-10-09
spellingShingle Detectors and Experimental Techniques
Caballero, J
Hover, J
Love, P
AutoPyFactory and the Cloud
title AutoPyFactory and the Cloud
topic Detectors and Experimental Techniques
url http://cds.cern.ch/record/1607122
work_keys_str_mv AT caballeroj autopyfactoryandthecloud
AT hoverj autopyfactoryandthecloud
AT lovep autopyfactoryandthecloud
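
The cascade described in the abstract (fill the cheapest resources first, drain in reverse order as demand drops) can be illustrated with a minimal sketch. This is not APF code: the names `Tier`, `fill`, and `drain`, the tier labels, and the limits are all hypothetical, and weighting is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """One level of the cascade (hypothetical model, not an APF class)."""
    name: str   # illustrative label, e.g. "private" or "spot"
    limit: int  # maximum number of VMs allowed on this tier


def fill(tiers, demand):
    """Spread `demand` VMs over the tiers in priority order.

    `tiers` is ordered from most preferred (free or cheap) to least
    preferred (costly). Each tier is filled up to its limit before the
    next one is used. Demand beyond the total capacity is left unserved.
    """
    remaining = max(demand, 0)
    allocation = {}
    for tier in tiers:
        count = min(remaining, tier.limit)
        allocation[tier.name] = count
        remaining -= count
    return allocation


def drain(tiers, running, demand):
    """Return how many VMs to retire per tier when demand drops.

    Tiers are visited in reverse order, so the most expensive
    resources are released first.
    """
    excess = max(sum(running.values()) - max(demand, 0), 0)
    retire = {}
    for tier in reversed(tiers):
        count = min(excess, running.get(tier.name, 0))
        retire[tier.name] = count
        excess -= count
    return retire


# Assumed example cascade, ordered from cheapest to most expensive.
tiers = [
    Tier("private", 10),
    Tier("academic", 5),
    Tier("spot", 20),
    Tier("ondemand", 50),
]

running = fill(tiers, 18)            # ramp-up to 18 VMs
to_retire = drain(tiers, running, 12)  # demand falls to 12 VMs
```

In this sketch a demand of 18 fills the private and academic tiers and spills three VMs onto the spot tier; when demand falls to 12, the spot VMs are retired before any academic or private ones.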