Achieving Resiliency in Production Worldwide Grid Services for the Large Hadron Collider at CERN


Bibliographic Details
Main author: Shiers, J
Language: eng
Published: 2007
Subjects:
Online access: http://cds.cern.ch/record/1072763
Description
Summary: The world’s coolest machine – also the largest scientific instrument to date – will enter production in 2008. Operating at a temperature below 2 K, the Large Hadron Collider (LHC) at CERN will generate massive amounts of data – some 15 PB per year – that will require significant computational and storage resources. A worldwide production Grid, the Worldwide LHC Computing Grid (WLCG) [1], has been set up, building on the infrastructures of two main Grids: the Open Science Grid (OSG) in the US [2] and the Enabling Grids for E-SciencE (EGEE) in Europe and elsewhere [3]. This is a highly complex system with many components, yet it must provide a robust and resilient service. This paper describes the state of the Grid in terms of resiliency and is based on a workshop on WLCG Service Reliability held at CERN in November 2007. The goals of the workshop were to discuss and agree on the primary techniques for designing, building, deploying and operating robust and resilient services. Concrete targets are to achieve a measurable improvement in service reliability by the time of a WLCG Collaboration workshop in April 2008, and to have fully met the established targets approximately one year later. In this context, reliability is defined as the “ability of a system/component to perform its required functions under the stated conditions.”