Data streams processing in metadata integration system for HENP experiments

Heterogeneous metadata integration has become a widespread objective. Addressing it involves numerous tasks, such as data source analysis and storage schema development. No less important is the development of automated, configurable and highly manageable ETL...

Bibliographic Details
Main authors: Kaida, Anastasiia; Golosova, Marina; Grigoryeva, Maria; Aulov, Vasilii
Language: eng
Published: 2019
Subjects: Particle Physics - Experiment
Online access: http://cds.cern.ch/record/2690991
author Kaida, Anastasiia
Golosova, Marina
Grigoryeva, Maria
Aulov, Vasilii
collection CERN
description Heterogeneous metadata integration has become a widespread objective. Addressing it involves numerous tasks, such as data source analysis and storage schema development. No less important is the development of automated, configurable and highly manageable ETL (data Extraction, Transformation, and Load) processes, as well as the creation of tools for their automation, scheduling, management and monitoring. This work describes the Metadata Integration and Topology Management System, initially designed as a subsystem of the Data Knowledge Base (DKB) developed for the ATLAS experiment. The core idea of the subsystem is to separate the features common to most ETL processes from the implementation of particular tasks. It is implemented as standalone modules, a supervisor and workers: the supervisor builds data streams out of workers, each of which implements a set of specific operations for a particular process. The system is intended to considerably simplify the organization of ongoing data integration operations through automated data stream processing.
id cern-2690991
institution European Organization for Nuclear Research
language eng
publishDate 2019
record_format invenio
spelling cern-2690991 2019-09-30T06:29:59Z http://cds.cern.ch/record/2690991 eng ATL-SOFT-SLIDE-2019-699 oai:cds.cern.ch:2690991 2019-09-28
title Data streams processing in metadata integration system for HENP experiments
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2690991
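
The description above separates what most ETL processes have in common (chaining stages into a data stream) from the task-specific operations carried out by workers. The following is a minimal Python sketch of that separation, not the system's actual implementation: the names `Supervisor`, `extract`, `transform`, `ListLoader` and the record fields `task_id` and `status` are all hypothetical, and the real system runs its supervisor and workers as standalone modules rather than in-process callables.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict
# A worker (hypothetical interface): takes a stream of records, yields a stream.
Worker = Callable[[Iterable[Record]], Iterator[Record]]


def extract(_: Iterable[Record]) -> Iterator[Record]:
    """Task-specific stage: produce records from a made-up source."""
    for i in range(3):
        yield {"task_id": i, "status": "DONE" if i % 2 == 0 else "failed"}


def transform(records: Iterable[Record]) -> Iterator[Record]:
    """Task-specific stage: normalise the status field to lower case."""
    for record in records:
        yield {**record, "status": record["status"].lower()}


class ListLoader:
    """Task-specific stage: 'load' records into an in-memory list."""

    def __init__(self) -> None:
        self.stored: List[Record] = []

    def __call__(self, records: Iterable[Record]) -> Iterator[Record]:
        for record in records:
            self.stored.append(record)
            yield record


class Supervisor:
    """Common part: builds a data stream by chaining workers, then drives it."""

    def __init__(self, workers: Iterable[Worker]) -> None:
        self.workers = list(workers)

    def run(self, source: Iterable[Record] = ()) -> int:
        stream: Iterable[Record] = source
        for worker in self.workers:
            stream = worker(stream)  # output of one stage feeds the next
        # Consume the lazy stream and count the records that passed through.
        return sum(1 for _ in stream)


if __name__ == "__main__":
    loader = ListLoader()
    processed = Supervisor([extract, transform, loader]).run()
    print(processed, loader.stored)
```

In this sketch a new ETL process only needs new worker functions; the chaining logic in `Supervisor` stays unchanged, which is the kind of reuse the description attributes to the system.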