HammerCloud: A Stress Testing System for Distributed Analysis

Distributed analysis of LHC data is an I/O-intensive activity which places large demands on the internal network, storage, and local disks at remote computing facilities. Commissioning and maintaining a site to provide an efficient distributed analysis service is therefore a challenge which can be a...

Bibliographic Details
Main authors: van der Ster, Daniel C; Elmsheuser, Johannes; Ubeda Garcia, Mario; Paladin, Massimo
Language: eng
Published: 2011
Subjects: Computing and Computers
Online access: http://cds.cern.ch/record/1319378
_version_ 1780921460241465344
author van der Ster, Daniel C
Elmsheuser, Johannes
Ubeda Garcia, Mario
Paladin, Massimo
collection CERN
description Distributed analysis of LHC data is an I/O-intensive activity which places large demands on the internal network, storage, and local disks at remote computing facilities. Commissioning and maintaining a site to provide an efficient distributed analysis service is therefore a challenge which can be aided by tools that help evaluate a variety of infrastructure designs and configurations. HammerCloud (HC) is one such tool; it is a stress testing service used by central operations teams, regional coordinators, and local site admins to (a) submit an arbitrary number of analysis jobs to a number of sites, (b) maintain a predefined number of running jobs in steady state at the sites under test, (c) produce web-based reports summarizing the efficiency and performance of the sites under test, and (d) present a web interface for historical test results, both to evaluate progress and to compare sites. HC was built around the distributed analysis framework Ganga, exploiting its API for grid job management. HC has been employed by the ATLAS experiment for continuous testing of many sites worldwide, and also during large-scale computing challenges such as STEP'09 and UAT'09, where the scale of the tests exceeded 10,000 concurrently running jobs and 1,000,000 total jobs over multi-day periods. In addition, HC is being adopted by the CMS experiment; the plugin structure of HC allows the execution of CMS jobs using their official tool (CRAB).
id cern-1319378
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2011
record_format invenio
spelling cern-1319378; 2019-09-30T06:29:59Z; http://cds.cern.ch/record/1319378; eng; van der Ster, Daniel C; Elmsheuser, Johannes; Ubeda Garcia, Mario; Paladin, Massimo; HammerCloud: A Stress Testing System for Distributed Analysis; Computing and Computers; CERN-IT-2011-001; oai:cds.cern.ch:1319378; 2011-01-06
title HammerCloud: A Stress Testing System for Distributed Analysis
topic Computing and Computers
url http://cds.cern.ch/record/1319378
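
Illustrative note: the abstract says HC maintains a predefined number of jobs running at each site under test. The record does not describe how this is done, so the following is only a minimal sketch of one plausible top-up rule, not HammerCloud's implementation and not the Ganga API. The function names, the site names, and the choice to count pending jobs toward the target are all assumptions made for illustration.

```python
from __future__ import annotations


def jobs_to_submit(target_running: int, running: int, pending: int) -> int:
    """Return how many new jobs to submit so that running + pending reaches the target.

    Hypothetical rule: pending (queued) jobs count toward the target, so a site
    with a long queue is not flooded with further submissions.
    """
    if min(target_running, running, pending) < 0:
        raise ValueError("counts must be non-negative")
    return max(0, target_running - (running + pending))


def top_up_plan(
    targets: dict[str, int],
    status: dict[str, tuple[int, int]],
) -> dict[str, int]:
    """Compute the number of submissions per site.

    `targets` maps site name -> desired number of running jobs.
    `status` maps site name -> (running, pending); missing sites count as (0, 0).
    """
    plan = {}
    for site, target in targets.items():
        running, pending = status.get(site, (0, 0))
        plan[site] = jobs_to_submit(target, running, pending)
    return plan


if __name__ == "__main__":
    # Hypothetical sites and counts, for illustration only.
    targets = {"SITE_A": 100, "SITE_B": 50}
    status = {"SITE_A": (80, 5), "SITE_B": (50, 10)}
    print(top_up_plan(targets, status))  # {'SITE_A': 15, 'SITE_B': 0}
```

A real service would call such a rule periodically, after refreshing job states from the grid, and would hand the resulting counts to its submission layer.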