Fine-grained processing towards HL-LHC computing in ATLAS

Bibliographic Details
Main authors: Benjamin, Douglas, Calafiura, Paolo, Childers, John Taylor, De, Kaushik, Di Girolamo, Alessandro, Fullana Torregrosa, Esteban, Guan, Wen, Maeno, Tadashi, Magini, Nicolo, Nilsson, Paul, Oleynik, Danila, Sun, Shaojun, Tsulaia, Vakhtang, Van Gemmeren, Peter, Wenaus, Torre, Yang, Wei
Language: eng
Published: 2018
Subjects: Particle Physics - Experiment
Online access: http://cds.cern.ch/record/2645080
author Benjamin, Douglas
Calafiura, Paolo
Childers, John Taylor
De, Kaushik
Di Girolamo, Alessandro
Fullana Torregrosa, Esteban
Guan, Wen
Maeno, Tadashi
Magini, Nicolo
Nilsson, Paul
Oleynik, Danila
Sun, Shaojun
Tsulaia, Vakhtang
Van Gemmeren, Peter
Wenaus, Torre
Yang, Wei
author_sort Benjamin, Douglas
collection CERN
description During LHC Run 2, ATLAS has been developing and evaluating new fine-grained approaches to workflows and dataflows that make better use of computing resources in terms of storage, processing and networks. The compute-limited physics of ATLAS has driven the collaboration to aggressively harvest opportunistic cycles from resources that are often only transiently available, including HPCs, clouds, volunteer computing, and grid resources in transitional states. Fine-grained processing (typically with a granularity of a few minutes, corresponding to one event for the present ATLAS full simulation) enables agile workflows with a light footprint on the resource, so that cycles can be used more fully and efficiently than with conventional workflows processing O(GB) files per job. The workflow component of this approach, the ATLAS Event Service, is currently in production on some grid sites and on several supercomputing sites. The Event Service architecture allows real-time delivery of fine-grained workloads to payload applications running on compute nodes. The outputs produced by the payload applications are immediately streamed out to a secure location, so that Event Service jobs can be terminated at practically any time with minimal data loss. On HPCs the architecture gives us the flexibility to dynamically vary the size of submitted jobs from several up to thousands of concurrent nodes, and the duration of jobs from less than an hour (backfill jobs) to multiple hours, thus maximizing the utilization of the machine by ensuring that every processing unit remains productively occupied. The architecture is an HPC-internal, MPI-based version of the highly scalable global workload management system of ATLAS, which presently manages up to 1.2 million concurrent processors around the clock. This makes it a proven scalable candidate for exascale computing, which is expected to be an important element of LHC Run-3 computing from 2021 and of HL-LHC computing from 2026.
R&D on the fine-grained processing system is now shifting to the data flow component, the Event Streaming Service (ESS). The ESS approach fits naturally into 'data lake' conceptions of the HL-LHC computing ecosystem, in which data is served to consumers via CDN-like streaming services. These services mediate interactions between a hierarchical, distributed storage federation and a client requesting (just) the data it needs. This enables efficient and customized data transfer, which will be critical for I/O-intensive workloads, and minimizes costly disk-resident replicas. With disk storage expected to be the costliest component of HL-LHC computing, the development costs of building such an infrastructure should pay off in the efficiencies to be gained. This presentation will describe the present state and future plans for this fine-grained processing development program, targeted ultimately at the ATLAS HL-LHC program.
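The event-level idea in the description can be illustrated with a small sketch. This is not the ATLAS Event Service or its API: the function names, the fixed 3-minute event cost, and the in-memory `store` dictionary are all hypothetical stand-ins chosen for illustration. The sketch only shows why writing each event's output out immediately bounds the loss on termination to the one event in flight, and lets the unprocessed events be handed to another slot.

```python
from collections import deque


def run_event_service(event_ids, budget, process, store):
    """Process events one at a time until the slot's time budget runs out.

    Hypothetical sketch: each finished event is written to `store`
    immediately, so a termination loses at most the in-flight event.
    Returns the events that still need to be dispatched elsewhere.
    """
    pending = deque(event_ids)
    used = 0
    while pending:
        event = pending[0]
        cost, output = process(event)
        if used + cost > budget:
            # The resource is reclaimed mid-event: only this event is lost.
            break
        used += cost
        store[event] = output  # stream the output out right away
        pending.popleft()
    return list(pending)


def simulate_event(event_id):
    # Assumed cost of 3 minutes per event and a dummy payload.
    return 3, f"hits-{event_id}"


store = {}
# A short backfill slot of 10 minutes completes three events.
leftover = run_event_service(range(10), 10, simulate_event, store)
# A later, longer slot picks up exactly the events that remain.
remaining = run_event_service(leftover, 30, simulate_event, store)
```

With a conventional file-level job, an interruption after 10 minutes would discard all work done so far; here the first three outputs are already safe and the second slot resumes from event 3.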
id cern-2645080
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2018
record_format invenio
spelling cern-2645080 2019-09-30T06:29:59Z http://cds.cern.ch/record/2645080 eng Particle Physics - Experiment ATL-SOFT-SLIDE-2018-981 oai:cds.cern.ch:2645080 2018-10-26
title Fine-grained processing towards HL-LHC computing in ATLAS
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2645080