Persistent ATLAS Data Structures and Reclustering of Event Data

Bibliographic Details
Main author: Schaller, Martin
Language: eng
Published: U. 1999
Subjects:
Online access: http://cds.cern.ch/record/1388256
Description
Summary: The ATLAS experiment will start to take data in the year 2005. The amount of experimental data poses a serious challenge for data processing and data storage: about 1 PB (10^15 bytes) per year has to be processed and stored. Currently, a paradigm shift in High-Energy Physics (HEP) computing is taking place. Software is planned to be written in object-oriented languages (mainly C++), and object-oriented database management systems (ODBMSs) are foreseen for data storage. This thesis investigates the usage of an ODBMS in the ATLAS experiment. Work was done in several connected areas. First, we present exhaustive benchmarks of the commercial ODBMS Objectivity/DB, which is today the most promising candidate for the storage system. We describe the ATLAS 1 TB milestone, which was performed to investigate the reliability and performance of an ODBMS storage solution coupled to a mass storage system. Second, we report on the design and implementation of the persistent ATLAS data structures, in both the detector description and event domains. We describe the implementation of the AMDB (ATLAS Muon Database), the design of the raw event model and the implementation of several event collection classes. The most important result and the main novel contribution of this thesis is a new reclustering algorithm for the event data. Clustering describes the object placement on disk. It is one of the most effective performance enhancement techniques for ODBMSs. We describe a reclustering algorithm for objects contained in several (possibly overlapping) collections. The algorithm works by decomposing the collections into a set of non-overlapping atomic regions. The objects within each of these finer subsets are stored together; thus the issue is how to order the subsets. The problem is mapped to a weighted graph, resulting in an instance of the traveling salesman problem. A standard heuristic can be chosen to find an approximate solution.
We can show that under a set of realistic and natural assumptions the algorithm reduces the number of disk seeks almost to the theoretical lower limit. We describe the design and implementation of a prototype. The reclustering algorithm is qualitatively and quantitatively analyzed. The experimental results are presented and compared with the theoretical model.
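
The reclustering idea in the summary can be illustrated with a minimal sketch. This is not the thesis prototype: the function names, the edge weight (the number of collections contained in exactly one of two regions) and the nearest-neighbour heuristic are all assumptions chosen for illustration, since the summary does not specify them. The sketch decomposes overlapping collections into atomic regions, orders the regions greedily, and counts one seek per contiguous run that a collection occupies in the resulting layout.

```python
from collections import defaultdict


def atomic_regions(collections):
    """Group objects by the exact set of collections that contain them.

    `collections` maps a collection name to a set of object ids.
    Returns a dict: frozenset of collection names -> sorted list of objects.
    """
    membership = defaultdict(set)
    for name, objects in collections.items():
        for obj in objects:
            membership[obj].add(name)
    regions = defaultdict(list)
    for obj, names in membership.items():
        regions[frozenset(names)].append(obj)
    return {sig: sorted(objs) for sig, objs in regions.items()}


def order_regions(regions):
    """Nearest-neighbour tour over the regions (hypothetical weight).

    Assumed edge weight: size of the symmetric difference of the two
    membership sets, i.e. the number of collections whose contiguous
    run starts or ends between the two regions. The tour starts from
    an empty dummy region.
    """
    remaining = sorted(regions, key=sorted)  # deterministic tie-breaking
    tour = []
    last = frozenset()
    while remaining:
        nxt = min(remaining, key=lambda sig: len(sig ^ last))
        remaining.remove(nxt)
        tour.append(nxt)
        last = nxt
    return tour


def count_seeks(tour, collections):
    """Count contiguous runs per collection in the layout (one seek per run)."""
    seeks = 0
    for name in collections:
        inside = False
        for sig in tour:
            hit = name in sig
            if hit and not inside:
                seeks += 1
            inside = hit
    return seeks


# Illustrative input only: three overlapping collections of object ids.
example = {"A": {1, 2, 3, 4}, "B": {3, 4, 5, 6}, "C": {6, 7}}
example_regions = atomic_regions(example)
example_tour = order_regions(example_regions)
example_layout = [obj for sig in example_tour for obj in example_regions[sig]]
print(example_layout, count_seeks(example_tour, example))
```

In this small example each collection ends up in a single contiguous run, so reading any one collection needs one seek. A greedy tour does not guarantee that in general; any other traveling salesman heuristic could be substituted in `order_regions` without changing the rest of the sketch.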