Reclustering of high energy physics data
The coming high energy physics experiments will store Petabytes of data into object databases. Analysis jobs will frequently traverse collections containing millions of stored objects. Clustering is one of the most effective means to enhance the performance of these applications. The paper presen...
Main author: | Schaller, M |
---|---|
Language: | eng |
Published: | 1999 |
Subjects: | Computing and Computers |
Online access: | http://cds.cern.ch/record/409785 |
_version_ | 1780894532880039936 |
---|---|
author | Schaller, M |
author_facet | Schaller, M |
author_sort | Schaller, M |
collection | CERN |
description | The coming high energy physics experiments will store Petabytes of data into object databases. Analysis jobs will frequently traverse collections containing millions of stored objects. Clustering is one of the most effective means to enhance the performance of these applications. The paper presents a reclustering algorithm for independent objects contained in multiple possibly overlapping collections on secondary storage. The algorithm decomposes the stored objects into a number of independent chunks and then maps these chunks to a traveling salesman problem. Under a set of realistic assumptions, the number of disk seeks is reduced almost to the theoretical minimum. Experimental results obtained from a prototype are included. (17 refs). |
id | cern-409785 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 1999 |
record_format | invenio |
spelling | cern-4097852019-09-30T06:29:59Zhttp://cds.cern.ch/record/409785engSchaller, MReclustering of high energy physics dataComputing and ComputersThe coming high energy physics experiments will store Petabytes of data into object databases. Analysis jobs will frequently traverse collections containing millions of stored objects. Clustering is one of the most effective means to enhance the performance of these applications. The paper presents a reclustering algorithm for independent objects contained in multiple possibly overlapping collections on secondary storage. The algorithm decomposes the stored objects into a number of independent chunks and then maps these chunks to a traveling salesman problem. Under a set of realistic assumptions, the number of disk seeks is reduced almost to the theoretical minimum. Experimental results obtained from a prototype are included. (17 refs).oai:cds.cern.ch:4097851999 |
spellingShingle | Computing and Computers Schaller, M Reclustering of high energy physics data |
title | Reclustering of high energy physics data |
title_full | Reclustering of high energy physics data |
title_fullStr | Reclustering of high energy physics data |
title_full_unstemmed | Reclustering of high energy physics data |
title_short | Reclustering of high energy physics data |
title_sort | reclustering of high energy physics data |
topic | Computing and Computers |
url | http://cds.cern.ch/record/409785 |
work_keys_str_mv | AT schallerm reclusteringofhighenergyphysicsdata |
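
The description in this record outlines the approach only in prose: stored objects are decomposed into chunks, and the chunks are ordered through a traveling salesman formulation so that each collection can be read with few disk seeks. The Python sketch below illustrates that general idea and is not the paper's implementation. It rests on assumptions that do not appear in the record: a chunk is taken to be the group of objects that belong to exactly the same collections, the distance between two chunks is the number of collections in which their membership differs, and the tour is built with a greedy nearest-neighbor heuristic rather than any particular TSP solver. All function names (`build_chunks`, `distance`, `order_chunks`, `count_seeks`) and the toy data are hypothetical.

```python
from collections import defaultdict


def build_chunks(collections):
    """Group objects by the exact set of collections they belong to.

    Assumption: a "chunk" is the set of objects sharing one membership signature.
    """
    membership = defaultdict(set)
    for name, objects in collections.items():
        for obj in objects:
            membership[obj].add(name)
    chunks = defaultdict(list)
    for obj, names in membership.items():
        chunks[frozenset(names)].append(obj)
    return {sig: sorted(objs) for sig, objs in chunks.items()}


def distance(sig_a, sig_b):
    """Assumed metric: number of collections containing one chunk but not the other."""
    return len(sig_a ^ sig_b)


def order_chunks(chunks):
    """Greedy nearest-neighbor tour over chunk signatures (a heuristic, not an exact solver)."""
    remaining = sorted(chunks, key=lambda s: sorted(s))
    if not remaining:
        return []
    tour = [remaining.pop(0)]
    while remaining:
        last = tour[-1]
        # Ties are broken by the sorted collection names to keep the result deterministic.
        nxt = min(remaining, key=lambda s: (distance(last, s), sorted(s)))
        remaining.remove(nxt)
        tour.append(nxt)
    return tour


def count_seeks(tour, collection):
    """Count contiguous runs of chunks belonging to `collection`; each run costs one seek."""
    seeks, inside = 0, False
    for sig in tour:
        member = collection in sig
        if member and not inside:
            seeks += 1
        inside = member
    return seeks


# Hypothetical toy input: three overlapping collections of object ids.
collections = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5, 6},
    "C": {5, 6, 7},
}
chunks = build_chunks(collections)
tour = order_chunks(chunks)
layout = [obj for sig in tour for obj in chunks[sig]]
```

For this toy input, each of the three collections ends up in a single contiguous run of the layout, so `count_seeks` returns 1 for every collection. With more heavily overlapping collections a greedy tour will generally not reach that minimum, which is where a stronger TSP heuristic would matter.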