Fine-grained processing towards HL-LHC computing in ATLAS
During LHC's Run-2 ATLAS has been developing and evaluating new fine-grained approaches to workflows and dataflows able to better utilize computing resources in terms of storage, processing and networks. The compute-limited physics of ATLAS has driven the collaboration to aggressively harvest o...
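Main authors: | Benjamin, Douglas; Calafiura, Paolo; Childers, John Taylor; De, Kaushik; Di Girolamo, Alessandro; Fullana Torregrosa, Esteban; Guan, Wen; Maeno, Tadashi; Magini, Nicolo; Nilsson, Paul; Oleynik, Danila; Sun, Shaojun; Tsulaia, Vakhtang; Van Gemmeren, Peter; Wenaus, Torre; Yang, Wei |
---|---|
Language: | eng |
Published: | 2018 |
Subjects: | Particle Physics - Experiment |
Online access: | http://cds.cern.ch/record/2645080 |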
_version_ | 1780960426808311808 |
---|---|
author | Benjamin, Douglas; Calafiura, Paolo; Childers, John Taylor; De, Kaushik; Di Girolamo, Alessandro; Fullana Torregrosa, Esteban; Guan, Wen; Maeno, Tadashi; Magini, Nicolo; Nilsson, Paul; Oleynik, Danila; Sun, Shaojun; Tsulaia, Vakhtang; Van Gemmeren, Peter; Wenaus, Torre; Yang, Wei |
author_facet | Benjamin, Douglas; Calafiura, Paolo; Childers, John Taylor; De, Kaushik; Di Girolamo, Alessandro; Fullana Torregrosa, Esteban; Guan, Wen; Maeno, Tadashi; Magini, Nicolo; Nilsson, Paul; Oleynik, Danila; Sun, Shaojun; Tsulaia, Vakhtang; Van Gemmeren, Peter; Wenaus, Torre; Yang, Wei |
author_sort | Benjamin, Douglas |
collection | CERN |
description | During LHC's Run-2 ATLAS has been developing and evaluating new fine-grained approaches to workflows and dataflows able to better utilize computing resources in terms of storage, processing and networks. The compute-limited physics of ATLAS has driven the collaboration to aggressively harvest opportunistic cycles from what are often transiently available resources, including HPCs, clouds, volunteer computing, and grid resources in transitional states. Fine-grained processing (with typically a few minutes’ granularity, corresponding to one event for the present ATLAS full simulation) enables agile workflows with a light footprint on the resource such that cycles can be more fully and efficiently utilized than with conventional workflows processing O(GB) files per job. The workflow component of this approach, the ATLAS Event Service, is currently in production on some grid sites and on several supercomputing sites. The Event Service architecture allows real-time delivery of fine-grained workloads to payload applications running on compute nodes. The outputs produced by the payload applications are immediately streamed out into a secure location, such that Event Service jobs can be terminated practically at any time with minimal data losses. On HPCs the architecture gives us the flexibility to dynamically vary the size of submitted jobs from several up to thousands of concurrent nodes and the duration of jobs from less than an hour (backfill jobs) to multiple hours, thus maximizing the utilization of the machine by ensuring every processing unit remains productively occupied. The architecture is an HPC-internal, MPI-based version of the highly scalable global workload management system of ATLAS which presently manages up to 1.2 million concurrent processors around the clock. This makes it a proven scalable candidate for exascale computing, which is expected to be an important element of LHC Run-3 computing from 2021 and HL-LHC from 2026. 
Today the R&D attention of the development of the fine-grained processing system is shifting to the data flow component, the Event Streaming Service (ESS). The ESS approach fits naturally into 'data lake' conceptions of the HL-LHC computing ecosystem in which data is served to consumers via CDN-like streaming services mediating interactions between a hierarchical, distributed storage federation and a client requesting (just) the data it needs, enabling efficient and customized data transfer which will be critical for IO intensive workloads, and minimizing costly disk-resident replicas. With disk storage expected to be the costliest component of HL-LHC computing, the development costs of building such an infrastructure should pay off in the efficiencies to be gained. This presentation will describe the present state and future plans for this fine-grained processing development program, targeted ultimately at the ATLAS HL-LHC program. |
id | cern-2645080 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2018 |
record_format | invenio |
spelling | cern-2645080; 2019-09-30T06:29:59Z; http://cds.cern.ch/record/2645080; eng; Benjamin, Douglas; Calafiura, Paolo; Childers, John Taylor; De, Kaushik; Di Girolamo, Alessandro; Fullana Torregrosa, Esteban; Guan, Wen; Maeno, Tadashi; Magini, Nicolo; Nilsson, Paul; Oleynik, Danila; Sun, Shaojun; Tsulaia, Vakhtang; Van Gemmeren, Peter; Wenaus, Torre; Yang, Wei; Fine-grained processing towards HL-LHC computing in ATLAS; Particle Physics - Experiment; ATL-SOFT-SLIDE-2018-981; oai:cds.cern.ch:2645080; 2018-10-26 |
spellingShingle | Particle Physics - Experiment; Benjamin, Douglas; Calafiura, Paolo; Childers, John Taylor; De, Kaushik; Di Girolamo, Alessandro; Fullana Torregrosa, Esteban; Guan, Wen; Maeno, Tadashi; Magini, Nicolo; Nilsson, Paul; Oleynik, Danila; Sun, Shaojun; Tsulaia, Vakhtang; Van Gemmeren, Peter; Wenaus, Torre; Yang, Wei; Fine-grained processing towards HL-LHC computing in ATLAS |
title | Fine-grained processing towards HL-LHC computing in ATLAS |
title_full | Fine-grained processing towards HL-LHC computing in ATLAS |
title_fullStr | Fine-grained processing towards HL-LHC computing in ATLAS |
title_full_unstemmed | Fine-grained processing towards HL-LHC computing in ATLAS |
title_short | Fine-grained processing towards HL-LHC computing in ATLAS |
title_sort | fine-grained processing towards hl-lhc computing in atlas |
topic | Particle Physics - Experiment |
url | http://cds.cern.ch/record/2645080 |
work_keys_str_mv | AT benjamindouglas finegrainedprocessingtowardshllhccomputinginatlas AT calafiurapaolo finegrainedprocessingtowardshllhccomputinginatlas AT childersjohntaylor finegrainedprocessingtowardshllhccomputinginatlas AT dekaushik finegrainedprocessingtowardshllhccomputinginatlas AT digirolamoalessandro finegrainedprocessingtowardshllhccomputinginatlas AT fullanatorregrosaesteban finegrainedprocessingtowardshllhccomputinginatlas AT guanwen finegrainedprocessingtowardshllhccomputinginatlas AT maenotadashi finegrainedprocessingtowardshllhccomputinginatlas AT magininicolo finegrainedprocessingtowardshllhccomputinginatlas AT nilssonpaul finegrainedprocessingtowardshllhccomputinginatlas AT oleynikdanila finegrainedprocessingtowardshllhccomputinginatlas AT sunshaojun finegrainedprocessingtowardshllhccomputinginatlas AT tsulaiavakhtang finegrainedprocessingtowardshllhccomputinginatlas AT vangemmerenpeter finegrainedprocessingtowardshllhccomputinginatlas AT wenaustorre finegrainedprocessingtowardshllhccomputinginatlas AT yangwei finegrainedprocessingtowardshllhccomputinginatlas |
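
The description says that the Event Service delivers fine-grained workloads to payloads and streams each output out immediately, so a job can be stopped at almost any time with minimal loss. The following is only an illustrative sketch of that idea, not the ATLAS implementation: the names `event_ranges`, `run_payload`, `process`, `stream_out` and `should_stop` are hypothetical and do not correspond to any ATLAS or PanDA API.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

EventRange = Tuple[int, int]  # half-open (start, stop) interval of event indices


def event_ranges(n_events: int, range_size: int) -> Iterator[EventRange]:
    """Split n_events into consecutive ranges of at most range_size events."""
    for start in range(0, n_events, range_size):
        yield start, min(start + range_size, n_events)


def run_payload(
    ranges: Iterable[EventRange],
    process: Callable[[int, int], object],           # hypothetical per-range payload
    stream_out: Callable[[EventRange, object], None],  # hypothetical output sink
    should_stop: Callable[[], bool],                 # hypothetical preemption signal
) -> List[EventRange]:
    """Process ranges one at a time, saving each output before taking the next.

    If should_stop() becomes true (for example, the resource is reclaimed),
    every range already processed has been streamed out, so only work on
    unstarted ranges remains to be rescheduled.
    """
    completed: List[EventRange] = []
    for start, stop in ranges:
        if should_stop():
            break
        result = process(start, stop)
        stream_out((start, stop), result)  # output leaves the node right away
        completed.append((start, stop))
    return completed
```

In this sketch a caller would pass the list of completed ranges back to a scheduler, which could then hand the remaining ranges to another resource.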