A Monitoring System for the New ALICE O2 Farm
The ALICE Experiment has been designed to study the physics of strongly interacting matter with heavy-ion collisions at the CERN LHC. A major upgrade of the detector and computing model (O2, Offline-Online) is currently ongoing. The ALICE O2 farm will consist of almost 1000 nodes enabled to readout...
Main authors: | Vino, Gioacchino; Chibante Barroso, Vasco; Elia, Domenico; Wegrzynek, Adam |
---|---|
Language: | eng |
Published: | 2020 |
Subjects: | Detectors and Experimental Techniques; Accelerators and Storage Rings |
Online access: | https://dx.doi.org/10.18429/JACoW-ICALEPCS2019-TUDPP01 http://cds.cern.ch/record/2777808 |
_version_ | 1780971706471415808 |
---|---|
author | Vino, Gioacchino Chibante Barroso, Vasco Elia, Domenico Wegrzynek, Adam |
author_facet | Vino, Gioacchino Chibante Barroso, Vasco Elia, Domenico Wegrzynek, Adam |
author_sort | Vino, Gioacchino |
collection | CERN |
description | The ALICE Experiment has been designed to study the physics of strongly interacting matter with heavy-ion collisions at the CERN LHC. A major upgrade of the detector and computing model (O2, Offline-Online) is currently ongoing. The ALICE O2 farm will consist of almost 1000 nodes able to read out and process on the fly about 27 Tb/s of raw data. To increase the efficiency of computing farm operations, a general-purpose near-real-time monitoring system has been developed; it relies on high performance, high availability, modularity, and open-source software. The core component (Apache Kafka) provides high throughput, data pipelines, and fault-tolerant services. Additional monitoring functionality is based on Telegraf as the metric collector, Apache Spark for complex aggregation, InfluxDB as the time-series database, and Grafana as the visualization tool. A logging service based on the Elasticsearch stack is also included. The system handles metrics coming from the operating system, network, custom hardware, and in-house software. A prototype version is currently running at CERN and has also been successfully deployed by the ReCaS Datacenter at INFN Bari for both monitoring and logging. |
id | cern-2777808 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2020 |
record_format | invenio |
title | A Monitoring System for the New ALICE O2 Farm |
title_full | A Monitoring System for the New ALICE O2 Farm |
title_fullStr | A Monitoring System for the New ALICE O2 Farm |
title_full_unstemmed | A Monitoring System for the New ALICE O2 Farm |
title_short | A Monitoring System for the New ALICE O2 Farm |
title_sort | monitoring system for the new alice o2 farm |
topic | Detectors and Experimental Techniques Accelerators and Storage Rings |
url | https://dx.doi.org/10.18429/JACoW-ICALEPCS2019-TUDPP01 http://cds.cern.ch/record/2777808 |
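
The description above outlines a pipeline in which collected metrics are aggregated and then stored in a time-series database. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: in the described system the aggregation is done by Apache Spark and the transport by Apache Kafka, neither of which is used here. The names `Metric`, `aggregate`, and `to_line_protocol`, the 10-second window, and the `host` tag are all assumptions made for illustration. The output format follows InfluxDB line protocol (`measurement,tag=value field=value timestamp`); escaping of spaces and commas is deliberately left out.

```python
from collections import defaultdict
from typing import NamedTuple


class Metric(NamedTuple):
    """Hypothetical metric sample; field names are illustrative only."""
    name: str        # measurement name, e.g. "cpu"
    host: str        # node that produced the sample
    value: float     # sampled value
    timestamp: int   # seconds since the epoch


def aggregate(metrics, window_s=10):
    """Average samples per (name, host) over fixed, non-overlapping windows."""
    buckets = defaultdict(list)
    for m in metrics:
        # Align each sample to the start of the window it falls into.
        window_start = m.timestamp - (m.timestamp % window_s)
        buckets[(m.name, m.host, window_start)].append(m.value)
    return [
        Metric(name, host, sum(values) / len(values), start)
        for (name, host, start), values in sorted(buckets.items())
    ]


def to_line_protocol(m):
    """Render one metric as a line-protocol string with a nanosecond timestamp.

    Assumes names and tag values contain no spaces, commas, or equals signs.
    """
    return f"{m.name},host={m.host} value={m.value} {m.timestamp * 1_000_000_000}"


if __name__ == "__main__":
    samples = [
        Metric("cpu", "node01", 20.0, 100),
        Metric("cpu", "node01", 40.0, 105),
        Metric("cpu", "node01", 50.0, 112),
    ]
    for point in aggregate(samples):
        print(to_line_protocol(point))
```

Fixed windows keep the sketch short. A streaming engine such as Spark would typically also deal with late or out-of-order samples, which this example ignores.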