The CMS Data Aggregation System

Metadata plays a significant role in large modern enterprises, research experiments and digital libraries where it comes from many different sources and is distributed in a variety of digital formats. It is organized and managed by constantly evolving software using both relational and non-relat...

Bibliographic Details
Main authors: Kuznetsov, Valentin, Evans, Dave, Metson, Simon
Language: eng
Published: 2010
Subjects: Detectors and Experimental Techniques
Online access: http://cds.cern.ch/record/1285520
_version_ 1780920571289141248
author Kuznetsov, Valentin
Evans, Dave
Metson, Simon
author_facet Kuznetsov, Valentin
Evans, Dave
Metson, Simon
author_sort Kuznetsov, Valentin
collection CERN
description Metadata plays a significant role in large modern enterprises, research experiments and digital libraries, where it comes from many different sources and is distributed in a variety of digital formats. It is organized and managed by constantly evolving software using both relational and non-relational data sources. Even though we can apply an information retrieval approach to non-relational data sources, we can't do so for relational ones, where information is accessed via a pre-established set of data-services. Here we discuss a new data aggregation system which consumes, indexes and delivers information from different relational and non-relational data sources to answer cross data-service queries and explore metadata associated with petabytes of experimental data. The new system combines the simplicity of keyword-based search with the precision of an RDBMS. The aggregated information is collected from various sources, allowing end-users to place dynamic queries, get precise answers and trigger information retrieval on demand. Based on the use cases of the CMS experiment, we have performed a set of detailed, large-scale tests, the results of which we present in this paper.
id cern-1285520
institution European Organization for Nuclear Research
language eng
publishDate 2010
record_format invenio
spelling cern-1285520; 2019-09-30T06:29:59Z; http://cds.cern.ch/record/1285520; eng; Kuznetsov, Valentin; Evans, Dave; Metson, Simon; The CMS Data Aggregation System; Detectors and Experimental Techniques; Metadata plays a significant role in large modern enterprises, research experiments and digital libraries, where it comes from many different sources and is distributed in a variety of digital formats. It is organized and managed by constantly evolving software using both relational and non-relational data sources. Even though we can apply an information retrieval approach to non-relational data sources, we can't do so for relational ones, where information is accessed via a pre-established set of data-services. Here we discuss a new data aggregation system which consumes, indexes and delivers information from different relational and non-relational data sources to answer cross data-service queries and explore metadata associated with petabytes of experimental data. The new system combines the simplicity of keyword-based search with the precision of an RDBMS. The aggregated information is collected from various sources, allowing end-users to place dynamic queries, get precise answers and trigger information retrieval on demand. Based on the use cases of the CMS experiment, we have performed a set of detailed, large-scale tests, the results of which we present in this paper.; CMS-CR-2010-036; oai:cds.cern.ch:1285520; 2010-02-22
spellingShingle Detectors and Experimental Techniques
Kuznetsov, Valentin
Evans, Dave
Metson, Simon
The CMS Data Aggregation System
title The CMS Data Aggregation System
title_full The CMS Data Aggregation System
title_fullStr The CMS Data Aggregation System
title_full_unstemmed The CMS Data Aggregation System
title_short The CMS Data Aggregation System
title_sort cms data aggregation system
topic Detectors and Experimental Techniques
url http://cds.cern.ch/record/1285520
work_keys_str_mv AT kuznetsovvalentin thecmsdataaggregationsystem
AT evansdave thecmsdataaggregationsystem
AT metsonsimon thecmsdataaggregationsystem
AT kuznetsovvalentin cmsdataaggregationsystem
AT evansdave cmsdataaggregationsystem
AT metsonsimon cmsdataaggregationsystem