
Scale out databases for CERN use cases

Data generation rates are expected to grow very fast for some database workloads going into LHC run 2 and beyond. In particular this is expected for data coming from controls, logging and monitoring systems. Storing, administering and accessing big data sets in a relational database system can quick...


Bibliographic Details
Main authors: Baranowski, Zbigniew; Grzybek, Maciej; Canali, Luca; Garcia, Daniel Lanza; Surdy, Kacper
Language: eng
Published: 2015
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/664/4/042002
http://cds.cern.ch/record/2134552
_version_ 1780949903648751616
author Baranowski, Zbigniew
Grzybek, Maciej
Canali, Luca
Garcia, Daniel Lanza
Surdy, Kacper
author_facet Baranowski, Zbigniew
Grzybek, Maciej
Canali, Luca
Garcia, Daniel Lanza
Surdy, Kacper
author_sort Baranowski, Zbigniew
collection CERN
description Data generation rates are expected to grow very fast for some database workloads going into LHC run 2 and beyond. In particular this is expected for data coming from controls, logging and monitoring systems. Storing, administering and accessing big data sets in a relational database system can quickly become a very hard technical challenge, as the size of the active data set and the number of concurrent users increase. Scale-out database technologies are a rapidly developing set of solutions for deploying and managing very large data warehouses on commodity hardware and with open source software. In this paper we will describe the architecture and tests on database systems based on Hadoop and the Cloudera Impala engine. We will discuss the results of our tests, including tests of data loading and integration with existing data sources and in particular with relational databases. We will report on query performance tests done with various data sets of interest at CERN, notably data from the accelerator log database.
id oai-inspirehep.net-1413834
institution European Organization for Nuclear Research
language eng
publishDate 2015
record_format invenio
spellingShingle Computing and Computers
Baranowski, Zbigniew
Grzybek, Maciej
Canali, Luca
Garcia, Daniel Lanza
Surdy, Kacper
Scale out databases for CERN use cases
title Scale out databases for CERN use cases
title_full Scale out databases for CERN use cases
title_fullStr Scale out databases for CERN use cases
title_full_unstemmed Scale out databases for CERN use cases
title_short Scale out databases for CERN use cases
title_sort scale out databases for cern use cases
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/664/4/042002
http://cds.cern.ch/record/2134552