
Analysis of data integrity and storage quality of a distributed storage system

CERN uses the world’s largest scientific computing grid, WLCG, for distributed data storage and processing. Monitoring of the CPU and storage resources is an important and essential element to detect operational issues in its systems, for example in the storage elements, and to ensure their proper a...


Bibliographic Details
Main authors: Negru, Adrian Eduard, Betev, Latchezar, Carabaș, Mihai, Grigoraș, Costin, Țăpuş, Nicolae, Weisz, Sergiu
Language: eng
Published: 2021
Subjects:
Online access: https://dx.doi.org/10.1051/epjconf/202125102035
http://cds.cern.ch/record/2814357
author Negru, Adrian Eduard
Betev, Latchezar
Carabaș, Mihai
Grigoraș, Costin
Țăpuş, Nicolae
Weisz, Sergiu
collection CERN
description CERN uses the world’s largest scientific computing grid, WLCG, for distributed data storage and processing. Monitoring the CPU and storage resources is essential for detecting operational issues in its systems, for example in the storage elements, and for ensuring that they function properly and efficiently. The processing of experiment data depends strongly on data access quality and on data integrity, and both of these key parameters must be assured for the lifetime of the data. Given the substantial amount of data, O(200 PB), already collected by ALICE and kept at various storage elements around the globe, scanning every single data chunk would be very expensive, both in computing resources and in execution time. In this paper, we describe a distributed file crawler that addresses these limits by periodically extracting and analyzing statistically significant samples of files from storage elements and evaluating the results; the crawler is integrated with the existing monitoring solution, MonALISA.
id cern-2814357
institution European Organization for Nuclear Research
language eng
publishDate 2021
record_format invenio
spelling cern-2814357 2022-07-25T15:28:23Z doi:10.1051/epjconf/202125102035 http://cds.cern.ch/record/2814357 oai:cds.cern.ch:2814357
title Analysis of data integrity and storage quality of a distributed storage system
topic Computing and Computers