
Elastic Extension of a CMS Computing Centre Resources on External Clouds

Bibliographic Details
Main Authors: Codispoti, G, Di Maria, R, Aiftimiei, C, Bonacorsi, D, Calligola, P, Ciaschini, V, Costantini, A, Dal Pra, S, DeGirolamo, D, Grandi, C, Michelotto, D, Panella, M, Peco, G, Sapunenko, V, Sgaravatto, M, Taneja, S, Zizzi, G
Language: English
Published: 2016
Online Access: https://dx.doi.org/10.1088/1742-6596/762/1/012013
http://cds.cern.ch/record/2277799
Description
Summary: After the successful LHC data taking in Run-I and in view of the future runs, the LHC experiments are facing new challenges in the design and operation of the computing facilities. The computing infrastructure for Run-II is dimensioned to cope at most with the average amount of data recorded. The usage peaks, as already observed in Run-I, may however generate large backlogs, thus delaying the completion of the data reconstruction and ultimately the data availability for physics analysis. In order to cope with the production peaks, CMS - along the lines followed by other LHC experiments - is exploring the opportunity to access Cloud resources provided by external partners or commercial providers. Specific use cases have already been explored and successfully exploited during Long Shutdown 1 (LS1) and the first part of Run-II. In this work we present the proof of concept of the elastic extension of a CMS site, specifically the Bologna Tier-3, on an external OpenStack infrastructure. We focus on the “Cloud Bursting” of a CMS Grid site using a newly designed LSF configuration that allows the dynamic registration of new worker nodes to LSF. In this approach, the dynamically added worker nodes instantiated on the OpenStack infrastructure are transparently accessed by the LHC Grid tools and, at the same time, serve as an extension of the farm for local usage. The amount of allocated resources can thus be elastically adapted to the needs of the CMS experiment and of local users. Moreover, direct access to and integration of OpenStack resources into the CMS workload management system is explored. In this paper we present this approach, we report on the performance of the on-demand allocated resources, and we discuss the lessons learned and the next steps.
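
To illustrate the mechanism the abstract describes, the sketch below shows how a worker node might be instantiated on an external OpenStack cloud and dynamically registered with an LSF master via a cloud-init script. This is a minimal sketch, not the configuration used by the authors: the cloud name, image, flavor, network, hostnames, and LSF installation path are hypothetical placeholders, and it assumes the LSF master already has dynamic host registration enabled (e.g. ENABLE_DYNAMIC_HOSTS set at install time and LSF_HOST_ADDR_RANGE configured in the cluster file).

    import base64
    import openstack

    # Cloud-init script run at first boot: point the node at the LSF master
    # and start the LSF daemons so the host registers itself dynamically.
    # The master hostname and the LSF install path are placeholders.
    USER_DATA = """#!/bin/bash
    echo 'LSF_SERVER_HOSTS="lsf-master.example.org"' >> /opt/lsf/conf/lsf.conf
    source /opt/lsf/conf/profile.lsf
    lsadmin limstartup   # start the Load Information Manager
    lsadmin resstartup   # start the Remote Execution Server
    badmin hstartup      # start the slave batch daemon (sbatchd)
    """

    # Credentials come from clouds.yaml; "tier3-cloud" is a placeholder name.
    conn = openstack.connect(cloud="tier3-cloud")

    # Boot one worker node on the external OpenStack infrastructure; the
    # image, flavor, and network names are illustrative only.
    server = conn.compute.create_server(
        name="cms-wn-001",
        image_id=conn.compute.find_image("cms-wn-image").id,
        flavor_id=conn.compute.find_flavor("m1.large").id,
        networks=[{"uuid": conn.network.find_network("wn-net").id}],
        user_data=base64.b64encode(USER_DATA.encode()).decode(),
    )
    conn.compute.wait_for_server(server)
    print(f"Worker node {server.name} booted; it should appear in bhosts "
          "once sbatchd contacts the LSF master.")

In a setup along these lines, scaling the site up or down reduces to creating or deleting such instances: once sbatchd stops reporting, the master marks the host unavailable, so no Grid-side reconfiguration is needed.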