
Exploitation of network-segregated CPU resources in CMS

CMS is tackling the exploitation of CPU resources at HPC centers where compute nodes do not have network connectivity to the Internet. Pilot agents and payload jobs need to interact with external services from the compute nodes: access to the application software (CernVM-FS) and conditions data (Frontie...


Bibliographic Details
Main authors:	Acosta-Silva, C; Peris, A Delgado; Flix, J; Frey, J; Hernández, J M; Pérez-Calero Yzquierdo, A; Tannenbaum, T
Language:	eng
Published:	2021
Subjects:	Detectors and Experimental Techniques; Computing and Computers
Online access:	https://dx.doi.org/10.1051/epjconf/202125102020 http://cds.cern.ch/record/2797500

_version_	1780972395502239744
author	Acosta-Silva, C; Peris, A Delgado; Flix, J; Frey, J; Hernández, J M; Pérez-Calero Yzquierdo, A; Tannenbaum, T
author_facet	Acosta-Silva, C; Peris, A Delgado; Flix, J; Frey, J; Hernández, J M; Pérez-Calero Yzquierdo, A; Tannenbaum, T
author_sort	Acosta-Silva, C
collection	CERN
description	CMS is tackling the exploitation of CPU resources at HPC centers where compute nodes do not have network connectivity to the Internet. Pilot agents and payload jobs need to interact with external services from the compute nodes: access to the application software (CernVM-FS) and conditions data (Frontier), management of input and output data files (data management services), and job management (HTCondor). Finding an alternative route to these services is challenging. Seamless integration in the CMS production system without causing any operational overhead is a key goal. We describe in this paper the solutions developed within CMS to overcome the restrictions imposed by network-segregated compute nodes. The Barcelona Supercomputing Center (BSC) in Spain has been used as a testbed for the integration in production of this kind of resource. Singularity containers with application software releases are built and pre-placed in the HPC shared file system together with conditions data files. HTCondor has been extended to relay communications between running pilot jobs and HTCondor daemons through the HPC shared file system. This operation mode also allows piping input and output data files through the HPC file system. Results, issues encountered during the integration process, and remaining concerns are discussed in this report.
id	cern-2797500
institution	European Organization for Nuclear Research
language	eng
publishDate	2021
record_format	invenio
spelling	cern-2797500 | 2022-08-23T08:59:02Z | doi:10.1051/epjconf/202125102020 | http://cds.cern.ch/record/2797500 | eng | Acosta-Silva, C; Peris, A Delgado; Flix, J; Frey, J; Hernández, J M; Pérez-Calero Yzquierdo, A; Tannenbaum, T | Exploitation of network-segregated CPU resources in CMS | Detectors and Experimental Techniques; Computing and Computers | CMS is tackling the exploitation of CPU resources at HPC centers where compute nodes do not have network connectivity to the Internet. Pilot agents and payload jobs need to interact with external services from the compute nodes: access to the application software (CernVM-FS) and conditions data (Frontier), management of input and output data files (data management services), and job management (HTCondor). Finding an alternative route to these services is challenging. Seamless integration in the CMS production system without causing any operational overhead is a key goal. The case of the Barcelona Supercomputing Center (BSC), in Spain, is particularly challenging, due to its especially restrictive network setup. We describe in this paper the solutions developed within CMS to overcome these restrictions, and integrate this resource in production. Singularity containers with application software releases are built and pre-placed in the HPC facility shared file system, together with conditions data files. HTCondor has been extended to relay communications between running pilot jobs and HTCondor daemons through the HPC shared file system. This operation mode also allows piping input and output data files through the HPC file system. Results, issues encountered during the integration process, and remaining concerns are discussed. | CMS-CR-2021-017 | oai:cds.cern.ch:2797500 | 2021-02-19
spellingShingle	Detectors and Experimental Techniques Computing and Computers Acosta-Silva, C Peris, A Delgado Flix, J Frey, J Hernández, J M Pérez-Calero Yzquierdo, A Tannenbaum, T Exploitation of network-segregated CPU resources in CMS
title	Exploitation of network-segregated CPU resources in CMS
title_full	Exploitation of network-segregated CPU resources in CMS
title_fullStr	Exploitation of network-segregated CPU resources in CMS
title_full_unstemmed	Exploitation of network-segregated CPU resources in CMS
title_short	Exploitation of network-segregated CPU resources in CMS
title_sort	exploitation of network-segregated cpu resources in cms
topic	Detectors and Experimental Techniques Computing and Computers
url	https://dx.doi.org/10.1051/epjconf/202125102020 http://cds.cern.ch/record/2797500
work_keys_str_mv	AT acostasilvac exploitationofnetworksegregatedcpuresourcesincms AT perisadelgado exploitationofnetworksegregatedcpuresourcesincms AT flixj exploitationofnetworksegregatedcpuresourcesincms AT freyj exploitationofnetworksegregatedcpuresourcesincms AT hernandezjm exploitationofnetworksegregatedcpuresourcesincms AT perezcaleroyzquierdoa exploitationofnetworksegregatedcpuresourcesincms AT tannenbaumt exploitationofnetworksegregatedcpuresourcesincms
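
Illustrative note: the description above says that communications between pilot jobs and HTCondor daemons are relayed through the HPC shared file system. The record does not give the mechanism, so the following Python sketch only illustrates the general idea of passing messages through a shared directory. It is not HTCondor code; the function names (post_message, drain_messages), the ".msg" and ".tmp" suffixes, and the JSON message format are all assumptions made for this example.

```python
import itertools
import json
import os
import uuid
from pathlib import Path

# Per-process counter so that messages from one writer sort in posting order.
# (Hypothetical scheme: ordering across several writers is not guaranteed.)
_sequence = itertools.count()


def post_message(spool_dir, payload):
    """Write one JSON message into the shared spool directory.

    The message is written under a temporary name and then renamed, so a
    reader scanning for "*.msg" never sees a half-written file.
    """
    spool = Path(spool_dir)
    spool.mkdir(parents=True, exist_ok=True)
    name = f"{next(_sequence):010d}-{uuid.uuid4().hex}"
    tmp_path = spool / (name + ".tmp")
    final_path = spool / (name + ".msg")
    tmp_path.write_text(json.dumps(payload))
    # os.replace is atomic when source and target are on the same file system.
    os.replace(tmp_path, final_path)
    return final_path


def drain_messages(spool_dir):
    """Read and delete all complete messages, in file-name order."""
    spool = Path(spool_dir)
    messages = []
    for path in sorted(spool.glob("*.msg")):
        messages.append(json.loads(path.read_text()))
        path.unlink()
    return messages
```

In such a scheme, a process on the network-isolated compute node would call post_message, and a process on a node with outside connectivity would call drain_messages periodically and forward the contents. A second directory used in the opposite direction would carry replies.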
