AsyncStageOut: Distributed User Data Management for CMS Analysis

AsyncStageOut (ASO) is a new component of the distributed data analysis system of CMS, CRAB, designed for managing users' data. It addresses a major weakness of the previous model, namely that mass storage of output data was part of the job execution resulting in inefficient use of job slots an...

Bibliographic Details
Main authors: Riahi, H, Wildish, T, Ciangottini, D, Hernández, J M, Andreeva, J, Balcas, J, Karavakis, E, Mascheroni, M, Tanasijczuk, A J, Vaandering, E W
Language: eng
Published: 2015
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/664/6/062052
http://cds.cern.ch/record/2134613
_version_ 1780949916151971840
author Riahi, H
Wildish, T
Ciangottini, D
Hernández, J M
Andreeva, J
Balcas, J
Karavakis, E
Mascheroni, M
Tanasijczuk, A J
Vaandering, E W
author_facet Riahi, H
Wildish, T
Ciangottini, D
Hernández, J M
Andreeva, J
Balcas, J
Karavakis, E
Mascheroni, M
Tanasijczuk, A J
Vaandering, E W
author_sort Riahi, H
collection CERN
description AsyncStageOut (ASO) is a new component of the distributed data analysis system of CMS, CRAB, designed for managing users' data. It addresses a major weakness of the previous model, namely that mass storage of output data was part of the job execution, resulting in inefficient use of job slots and an unacceptable failure rate at the end of the jobs. ASO foresees the management of up to 400k files per day of various sizes, spread worldwide across more than 60 sites. It must handle up to 1000 individual users per month, and work with minimal delay. This creates challenging requirements for system scalability, performance and monitoring. ASO uses FTS to schedule and execute the transfers between the storage elements of the source and destination sites. It has evolved from a limited prototype to a highly adaptable service, which manages and monitors the user file placement and bookkeeping. To ensure system scalability and data monitoring, it employs new technologies such as a NoSQL database and re-uses existing components of PhEDEx and the FTS Dashboard. We present the asynchronous stage-out strategy and the architecture of the solution we implemented to deal with those issues and challenges. The deployment model for the high availability and scalability of the service is discussed. The performance of the system during the commissioning and the first phase of production is also shown, along with results from simulations designed to explore the limits of scalability.
id oai-inspirehep.net-1413970
institution European Organization for Nuclear Research
language eng
publishDate 2015
record_format invenio
spelling oai-inspirehep.net-1413970 | 2022-08-10T13:01:01Z | doi:10.1088/1742-6596/664/6/062052 | http://cds.cern.ch/record/2134613 | eng | Riahi, H; Wildish, T; Ciangottini, D; Hernández, J M; Andreeva, J; Balcas, J; Karavakis, E; Mascheroni, M; Tanasijczuk, A J; Vaandering, E W | AsyncStageOut: Distributed User Data Management for CMS Analysis | Computing and Computers | AsyncStageOut (ASO) is a new component of the distributed data analysis system of CMS, CRAB, designed for managing users' data. It addresses a major weakness of the previous model, namely that mass storage of output data was part of the job execution, resulting in inefficient use of job slots and an unacceptable failure rate at the end of the jobs. ASO foresees the management of up to 400k files per day of various sizes, spread worldwide across more than 60 sites. It must handle up to 1000 individual users per month, and work with minimal delay. This creates challenging requirements for system scalability, performance and monitoring. ASO uses FTS to schedule and execute the transfers between the storage elements of the source and destination sites. It has evolved from a limited prototype to a highly adaptable service, which manages and monitors the user file placement and bookkeeping. To ensure system scalability and data monitoring, it employs new technologies such as a NoSQL database and re-uses existing components of PhEDEx and the FTS Dashboard. We present the asynchronous stage-out strategy and the architecture of the solution we implemented to deal with those issues and challenges. The deployment model for the high availability and scalability of the service is discussed. The performance of the system during the commissioning and the first phase of production is also shown, along with results from simulations designed to explore the limits of scalability. | FERMILAB-CONF-15-605-CD | oai:inspirehep.net:1413970 | 2015
spellingShingle Computing and Computers
Riahi, H
Wildish, T
Ciangottini, D
Hernández, J M
Andreeva, J
Balcas, J
Karavakis, E
Mascheroni, M
Tanasijczuk, A J
Vaandering, E W
AsyncStageOut: Distributed User Data Management for CMS Analysis
title AsyncStageOut: Distributed User Data Management for CMS Analysis
title_full AsyncStageOut: Distributed User Data Management for CMS Analysis
title_fullStr AsyncStageOut: Distributed User Data Management for CMS Analysis
title_full_unstemmed AsyncStageOut: Distributed User Data Management for CMS Analysis
title_short AsyncStageOut: Distributed User Data Management for CMS Analysis
title_sort asyncstageout: distributed user data management for cms analysis
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/664/6/062052
http://cds.cern.ch/record/2134613
work_keys_str_mv AT riahih asyncstageoutdistributeduserdatamanagementforcmsanalysis
AT wildisht asyncstageoutdistributeduserdatamanagementforcmsanalysis
AT ciangottinid asyncstageoutdistributeduserdatamanagementforcmsanalysis
AT hernandezjm asyncstageoutdistributeduserdatamanagementforcmsanalysis
AT andreevaj asyncstageoutdistributeduserdatamanagementforcmsanalysis
AT balcasj asyncstageoutdistributeduserdatamanagementforcmsanalysis
AT karavakise asyncstageoutdistributeduserdatamanagementforcmsanalysis
AT mascheronim asyncstageoutdistributeduserdatamanagementforcmsanalysis
AT tanasijczukaj asyncstageoutdistributeduserdatamanagementforcmsanalysis
AT vaanderingew asyncstageoutdistributeduserdatamanagementforcmsanalysis