Using Graph Databases

Data in HEP are usually stored in tuples (tables), trees, nested tuples (trees of tuples) or relational (SQL-like) databases, with or without defined schema. But many of our data are graph-like and schema-less. They consist of entities with relations, some of which are known in advance, but many are...

Bibliographic Details
Main author: Hrivnac, Julius
Language: eng
Published: 2020
Subjects: Particle Physics - Experiment
Online access: https://dx.doi.org/10.1051/epjconf/202024504004
http://cds.cern.ch/record/2712902
_version_ 1780965318584172544
author Hrivnac, Julius
author_facet Hrivnac, Julius
author_sort Hrivnac, Julius
collection CERN
description Data in HEP are usually stored in tuples (tables), trees, nested tuples (trees of tuples) or relational (SQL-like) databases, with or without a defined schema. But many of our data are graph-like and schema-less. They consist of entities with relations, some of which are known in advance, while many are created ad hoc, later. Such structures are not well covered by relational (SQL) databases. It is not enough to be able to add new data with pre-defined relations; we need to add new relations. Graph databases have existed for a long time, but they have matured only recently, thanks to Big Data and AI (adaptive NN). There are now very good implementations and de facto standards available. The difference between SQL and graph databases is similar to the difference between Fortran and C++: on one side, a rigid system that can be highly optimized; on the other, a flexible, dynamic system that allows complex structures to be expressed. A graph database is a synthesis of OODB and SQLDB. It allows a web of objects to be expressed without the fragility of the OO world. It captures only the essential relations; it does not keep a complete object dump. Migrating to a graph database means moving structure from data to code, together with a migration from imperative to declarative semantics (things don't happen, but exist). The paper describes the basic principles of graph databases, together with an overview of existing standards and implementations. Their usefulness and usability are demonstrated on the concrete example of the ATLAS Event Index in two approaches: as full storage (all data are in the graph database) and as meta-storage (a layer of schema-less graph-like data implemented on top of more traditional storage). The usability, the interfaces with the surrounding framework and the performance of those solutions are discussed. The possible more general usefulness for generic experiments' storage is also discussed.
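The abstract's central point, that new relations (not only new data) must be addable after the fact, can be illustrated with a minimal in-memory sketch. This is not the paper's implementation and does not use any real graph database API: the `Graph` class, its methods, the vertex identifiers, the property names and the edge labels (`in_dataset`, `overlaps`) are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.vertices = {}              # vertex id -> free-form properties
        self.edges = defaultdict(list)  # (source id, label) -> target ids

    def add_vertex(self, vid, **props):
        # Vertices are schema-less: each one may carry different properties.
        self.vertices[vid] = props

    def add_edge(self, src, label, dst):
        # Any label is accepted; no schema has to be altered first.
        self.edges[(src, label)].append(dst)

    def out(self, src, label):
        # Follow all edges with the given label starting at src.
        return list(self.edges.get((src, label), []))


g = Graph()

# Made-up entities; the property values are placeholders, not real data.
g.add_vertex("event:1", run=1, number=42)
g.add_vertex("dataset:A", name="A")
g.add_vertex("dataset:B", name="B", format="example")

# A relation known in advance.
g.add_edge("event:1", "in_dataset", "dataset:A")
g.add_edge("event:1", "in_dataset", "dataset:B")

# A relation invented later, added without touching existing data.
g.add_edge("dataset:A", "overlaps", "dataset:B")

print(g.out("event:1", "in_dataset"))
print(g.out("dataset:A", "overlaps"))
```

In a relational layout, the `overlaps` relation would typically require a new join table or column before the first such row could be stored; in this sketch it is just another edge label.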
id cern-2712902
institution European Organization for Nuclear Research
language eng
publishDate 2020
record_format invenio
spelling cern-2712902
2021-03-22T22:08:54Z
doi:10.1051/epjconf/202024504004
http://cds.cern.ch/record/2712902
eng
Hrivnac, Julius
Using Graph Databases
Particle Physics - Experiment
ATL-SOFT-PROC-2020-025
oai:cds.cern.ch:2712902
2020-03-13
spellingShingle Particle Physics - Experiment
Hrivnac, Julius
Using Graph Databases
title Using Graph Databases
title_full Using Graph Databases
title_fullStr Using Graph Databases
title_full_unstemmed Using Graph Databases
title_short Using Graph Databases
title_sort using graph databases
topic Particle Physics - Experiment
url https://dx.doi.org/10.1051/epjconf/202024504004
http://cds.cern.ch/record/2712902
work_keys_str_mv AT hrivnacjulius usinggraphdatabases