
A Monitoring System for the New ALICE O2 Farm

The ALICE Experiment has been designed to study the physics of strongly interacting matter with heavy-ion collisions at the CERN LHC. A major upgrade of the detector and computing model (O2, Offline-Online) is currently ongoing. The ALICE O2 farm will consist of almost 1000 nodes enabled to readout...


Bibliographic Details
Main authors: Vino, Gioacchino; Chibante Barroso, Vasco; Elia, Domenico; Wegrzynek, Adam
Language: eng
Published: 2020
Subjects: Detectors and Experimental Techniques; Accelerators and Storage Rings
Online access: https://dx.doi.org/10.18429/JACoW-ICALEPCS2019-TUDPP01
http://cds.cern.ch/record/2777808
author Vino, Gioacchino
Chibante Barroso, Vasco
Elia, Domenico
Wegrzynek, Adam
collection CERN
description The ALICE Experiment has been designed to study the physics of strongly interacting matter with heavy-ion collisions at the CERN LHC. A major upgrade of the detector and computing model (O2, Offline-Online) is currently ongoing. The ALICE O2 farm will consist of almost 1000 nodes able to read out and process on the fly about 27 Tb/s of raw data. To increase the efficiency of computing farm operations, a general-purpose, near-real-time monitoring system has been developed. It relies on high performance, high availability, modularity, and open-source software. The core component, Apache Kafka, provides high throughput, data pipelines, and fault-tolerant services. Additional monitoring functionality is based on Telegraf as the metric collector, Apache Spark for complex aggregation, InfluxDB as the time-series database, and Grafana as the visualization tool. A logging service based on the Elasticsearch stack is also included. The system handles metrics coming from the operating system, the network, custom hardware, and in-house software. A prototype version is currently running at CERN and has also been successfully deployed by the ReCaS Datacenter at INFN Bari for both monitoring and logging.
id cern-2777808
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2020
record_format invenio
title A Monitoring System for the New ALICE O2 Farm
topic Detectors and Experimental Techniques
Accelerators and Storage Rings
url https://dx.doi.org/10.18429/JACoW-ICALEPCS2019-TUDPP01
http://cds.cern.ch/record/2777808
work_keys_str_mv AT vinogioacchino amonitoringsystemforthenewaliceo2farm
AT chibantebarrosovasco amonitoringsystemforthenewaliceo2farm
AT eliadomenico amonitoringsystemforthenewaliceo2farm
AT wegrzynekadam amonitoringsystemforthenewaliceo2farm
AT vinogioacchino monitoringsystemforthenewaliceo2farm
AT chibantebarrosovasco monitoringsystemforthenewaliceo2farm
AT eliadomenico monitoringsystemforthenewaliceo2farm
AT wegrzynekadam monitoringsystemforthenewaliceo2farm
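
The description in this record names an aggregation stage (Apache Spark) that feeds a time-series database (InfluxDB). The sketch below is not the authors' implementation. It is a minimal illustration, in plain Python with no Kafka or Spark dependency, of what those two steps could look like: averaging samples over fixed time windows and rendering the result in InfluxDB line protocol. The names `Metric`, `window_average`, and `to_line_protocol`, the `host` tag, the `value` field, and the 10-second window are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    # Hypothetical sample shape; the record does not specify a schema.
    name: str         # measurement name, e.g. "cpu"
    host: str         # node that produced the sample
    value: float
    timestamp_s: int  # Unix time in seconds


def window_average(metrics, window_s=10):
    """Average values per (name, host) over fixed, non-overlapping windows."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, sample count]
    for m in metrics:
        start = m.timestamp_s - m.timestamp_s % window_s  # window start time
        acc = sums[(m.name, m.host, start)]
        acc[0] += m.value
        acc[1] += 1
    return [
        Metric(name, host, total / count, start)
        for (name, host, start), (total, count) in sorted(sums.items())
    ]


def _escape_measurement(text):
    # Line protocol: commas and spaces must be escaped in measurement names.
    return text.replace(",", "\\,").replace(" ", "\\ ")


def _escape_tag(text):
    # Line protocol: commas, equals signs and spaces must be escaped in tags.
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(m):
    """Render one metric as a line-protocol record with a nanosecond timestamp."""
    return (
        f"{_escape_measurement(m.name)},host={_escape_tag(m.host)} "
        f"value={float(m.value)} {m.timestamp_s * 10**9}"
    )


samples = [
    Metric("cpu", "node01", 10.0, 100),
    Metric("cpu", "node01", 20.0, 105),
    Metric("cpu", "node01", 30.0, 112),
]
lines = [to_line_protocol(m) for m in window_average(samples)]
```

In a deployment like the one described, the windowing would be done by the stream-processing layer rather than in a single process; the sketch only shows the shape of the data at each step.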