
Getting the Most from Distributed Resources With an Analytics Platform for ATLAS Computing Services

To meet a sharply increasing demand for computing resources for LHC Run 2, ATLAS distributed computing systems reach far and wide to gather CPU resources and storage capacity to execute an evolving ecosystem of production and analysis workflow tools. Indeed more than a hundred computing sites from t...


Bibliographic Details
Main authors: Vukotic, Ilija, Gardner, Robert, Bryant, Lincoln
Language: eng
Published: 2016
Subjects: Particle Physics - Experiment
Online access: https://dx.doi.org/10.22323/1.282.0192
http://cds.cern.ch/record/2231055
_version_ 1780952625546526720
author Vukotic, Ilija
Gardner, Robert
Bryant, Lincoln
author_facet Vukotic, Ilija
Gardner, Robert
Bryant, Lincoln
author_sort Vukotic, Ilija
collection CERN
description To meet a sharply increasing demand for computing resources for LHC Run 2, ATLAS distributed computing systems reach far and wide to gather CPU resources and storage capacity to execute an evolving ecosystem of production and analysis workflow tools. Indeed more than a hundred computing sites from the Worldwide LHC Computing Grid, plus many “opportunistic” facilities at HPC centers, universities, national laboratories, and public clouds, combine to meet these requirements. These resources have characteristics (such as local queuing availability, proximity to data sources and target destinations, network latency and bandwidth capacity, etc.) affecting the overall processing efficiency and throughput. To quantitatively understand and in some instances predict behavior, we have developed a platform to aggregate, index (for user queries), and analyze the more important information streams affecting performance. These data streams come from the ATLAS production system (PanDA), the distributed data management system (Rucio), the network (throughput and latency measurements, aggregate link traffic), and from the computing facilities themselves. The platform brings new capabilities to the management of the overall system, including warehousing information, an interface to execute arbitrary data mining and machine learning algorithms over aggregated datasets, a platform to test usage scenarios, and a portal for user-designed analytics dashboards.
id cern-2231055
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2016
record_format invenio
spelling cern-2231055 2019-09-30T06:29:59Z doi:10.22323/1.282.0192 http://cds.cern.ch/record/2231055 eng Vukotic, Ilija Gardner, Robert Bryant, Lincoln Getting the Most from Distributed Resources With an Analytics Platform for ATLAS Computing Services Particle Physics - Experiment To meet a sharply increasing demand for computing resources for LHC Run 2, ATLAS distributed computing systems reach far and wide to gather CPU resources and storage capacity to execute an evolving ecosystem of production and analysis workflow tools. Indeed more than a hundred computing sites from the Worldwide LHC Computing Grid, plus many “opportunistic” facilities at HPC centers, universities, national laboratories, and public clouds, combine to meet these requirements. These resources have characteristics (such as local queuing availability, proximity to data sources and target destinations, network latency and bandwidth capacity, etc.) affecting the overall processing efficiency and throughput. To quantitatively understand and in some instances predict behavior, we have developed a platform to aggregate, index (for user queries), and analyze the more important information streams affecting performance. These data streams come from the ATLAS production system (PanDA), the distributed data management system (Rucio), the network (throughput and latency measurements, aggregate link traffic), and from the computing facilities themselves. The platform brings new capabilities to the management of the overall system, including warehousing information, an interface to execute arbitrary data mining and machine learning algorithms over aggregated datasets, a platform to test usage scenarios, and a portal for user-designed analytics dashboards. ATL-SOFT-PROC-2016-009 oai:cds.cern.ch:2231055 2016-11-08
spellingShingle Particle Physics - Experiment
Vukotic, Ilija
Gardner, Robert
Bryant, Lincoln
Getting the Most from Distributed Resources With an Analytics Platform for ATLAS Computing Services
title Getting the Most from Distributed Resources With an Analytics Platform for ATLAS Computing Services
title_full Getting the Most from Distributed Resources With an Analytics Platform for ATLAS Computing Services
title_fullStr Getting the Most from Distributed Resources With an Analytics Platform for ATLAS Computing Services
title_full_unstemmed Getting the Most from Distributed Resources With an Analytics Platform for ATLAS Computing Services
title_short Getting the Most from Distributed Resources With an Analytics Platform for ATLAS Computing Services
title_sort getting the most from distributed resources with an analytics platform for atlas computing services
topic Particle Physics - Experiment
url https://dx.doi.org/10.22323/1.282.0192
http://cds.cern.ch/record/2231055
work_keys_str_mv AT vukoticilija gettingthemostfromdistributedresourceswithananalyticsplatformforatlascomputingservices
AT gardnerrobert gettingthemostfromdistributedresourceswithananalyticsplatformforatlascomputingservices
AT bryantlincoln gettingthemostfromdistributedresourceswithananalyticsplatformforatlascomputingservices