Analysis of data integrity and storage quality of a distributed storage system
CERN uses the world’s largest scientific computing grid, WLCG, for distributed data storage and processing. Monitoring the CPU and storage resources is essential to detect operational issues in its systems, for example in the storage elements, and to ensure their proper a...
Main authors: | Negru, Adrian Eduard; Betev, Latchezar; Carabaș, Mihai; Grigoraș, Costin; Țăpuş, Nicolae; Weisz, Sergiu |
---|---|
Language: | eng |
Published: | 2021 |
Subjects: | Computing and Computers |
Online access: | https://dx.doi.org/10.1051/epjconf/202125102035 http://cds.cern.ch/record/2814357 |
_version_ | 1780973441204092928 |
---|---|
author | Negru, Adrian Eduard; Betev, Latchezar; Carabaș, Mihai; Grigoraș, Costin; Țăpuş, Nicolae; Weisz, Sergiu |
author_sort | Negru, Adrian Eduard |
collection | CERN |
description | CERN uses the world’s largest scientific computing grid, WLCG, for distributed data storage and processing. Monitoring the CPU and storage resources is essential to detect operational issues in its systems, for example in the storage elements, and to ensure their proper and efficient function. The processing of experiment data depends strongly on the data access quality as well as its integrity, and both of these key parameters must be assured for the data lifetime. Given the substantial amount of data, O(200 PB), already collected by ALICE and kept at various storage elements around the globe, scanning every single data chunk would be a very expensive process, both in terms of computing resource usage and in terms of execution time. In this paper, we describe a distributed file crawler that addresses these natural limits by periodically extracting and analyzing statistically significant samples of files from storage elements, evaluates the results, and is integrated with the existing monitoring solution, MonALISA. |
id | cern-2814357 |
institution | European Organization for Nuclear Research (CERN) |
language | eng |
publishDate | 2021 |
record_format | invenio |
title | Analysis of data integrity and storage quality of a distributed storage system |
topic | Computing and Computers |
url | https://dx.doi.org/10.1051/epjconf/202125102035 http://cds.cern.ch/record/2814357 |