CHEP2012

Bibliographic Details
Main authors: Cinquilli, M, Riahi, H, Spiga, D, Grandi, C, Mascheroni, M, Pepe, F, Vaandering, E
Language: eng
Published: 2012
Subjects: Computing and Computers
Online access: http://cds.cern.ch/record/1457974
_version_ 1780925147755053056
author Cinquilli, M
Riahi, H
Spiga, D
Grandi, C
Mascheroni, M
Pepe, F
Vaandering, E
collection CERN
description The CMS distributed data analysis workflow assumes that jobs run in a different location from where their results are finally stored. Typically the user output must be transferred across the network from one site to another, possibly on a different continent or over links not necessarily validated for high-bandwidth, high-reliability transfer. This step is named stage-out, and in CMS it was originally implemented as a synchronous step of the analysis job execution. However, our experience showed the weakness of this approach in terms of both low total job execution efficiency and failure rates, wasting precious CPU resources. The nature of analysis data makes it inappropriate to use PhEDEx, the core data placement system for CMS. As part of the new generation of CMS Workload Management tools, the Asynchronous Stage-Out system (AsyncStageOut) has been developed to enable third-party copy of the user output. The AsyncStageOut component manages gLite FTS transfers of data from the temporary store at the site where the job ran to the final location of the data, on behalf of the data owner. The tool uses Python daemons, built using the WMCore framework, and CouchDB to manage the queue of work and the FTS transfers. CouchDB also provides the platform for a dedicated operations monitoring system. In this paper, we present the motivation for the asynchronous stage-out system. We give an insight into the design and the implementation of key features, describing how it is coupled with the CMS workload management system. Finally, we show the results and the commissioning experience.
id cern-1457974
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2012
record_format invenio
spelling cern-1457974 | 2019-09-30T06:29:59Z | http://cds.cern.ch/record/1457974 | eng | Cinquilli, M; Riahi, H; Spiga, D; Grandi, C; Mascheroni, M; Pepe, F; Vaandering, E | CHEP2012 | Computing and Computers | The CMS distributed data analysis workflow assumes that jobs run in a different location from where their results are finally stored. Typically the user output must be transferred across the network from one site to another, possibly on a different continent or over links not necessarily validated for high-bandwidth, high-reliability transfer. This step is named stage-out, and in CMS it was originally implemented as a synchronous step of the analysis job execution. However, our experience showed the weakness of this approach in terms of both low total job execution efficiency and failure rates, wasting precious CPU resources. The nature of analysis data makes it inappropriate to use PhEDEx, the core data placement system for CMS. As part of the new generation of CMS Workload Management tools, the Asynchronous Stage-Out system (AsyncStageOut) has been developed to enable third-party copy of the user output. The AsyncStageOut component manages gLite FTS transfers of data from the temporary store at the site where the job ran to the final location of the data, on behalf of the data owner. The tool uses Python daemons, built using the WMCore framework, and CouchDB to manage the queue of work and the FTS transfers. CouchDB also provides the platform for a dedicated operations monitoring system. In this paper, we present the motivation for the asynchronous stage-out system. We give an insight into the design and the implementation of key features, describing how it is coupled with the CMS workload management system. Finally, we show the results and the commissioning experience. | CERN-IT-Note-2012-010 | IOP Conference Series, CHEP2012 Proceedings | oai:cds.cern.ch:1457974 | 2012-06-26
title CHEP2012
topic Computing and Computers
url http://cds.cern.ch/record/1457974