Cargando…
Operation of the ATLAS distributed computing
We describe the central operation of the ATLAS distributed computing system. The majority of compute intensive activities within ATLAS are carried out on some 350,000 CPU cores on the Grid, augmented by opportunistic usage of significant HPC and volunteer resources. The increasing scale, and challen...
Autores principales: | , , , , , , , |
---|---|
Lenguaje: | eng |
Publicado: |
2018
|
Materias: | |
Acceso en línea: | http://cds.cern.ch/record/2626049 |
Sumario: | We describe the central operation of the ATLAS distributed computing system. The majority of compute intensive activities within ATLAS are carried out on some 350,000 CPU cores on the Grid, augmented by opportunistic usage of significant HPC and volunteer resources. The increasing scale, and challenging new payloads, demand fine-tuning of operational procedures together with timely developments of the production system. We describe several such developments, motivated directly from operational experience. Optimization of inefficient task requests, from both official production and users, is made possible by automatic detection of payload properties. User education, job shaping or preventative throttling help to increase the overall throughput of the available resources. |
---|