Big Data technologies and distributed data processing with SQL (lecture)

Interest in solutions based on the Big Data ecosystem, such as Hadoop, is constantly increasing among many user communities at CERN, including physics experiments, monitoring, and accelerator controls. This lecture will introduce the participant to some of the Big Data technologie...

Bibliographic Details
Main author: Kleszcz, Emil
Language: eng
Published: 2020
Subjects:
Online access: http://cds.cern.ch/record/2737354
_version_ 1780967723911610368
author Kleszcz, Emil
author_facet Kleszcz, Emil
author_sort Kleszcz, Emil
collection CERN
description Interest in solutions based on the Big Data ecosystem, such as Hadoop, is constantly increasing among many user communities at CERN, including physics experiments, monitoring, and accelerator controls. This lecture will introduce the participant to some of the Big Data technologies that CERN offers for distributed processing. It will cover some history, architecture, and specifics of selected technologies such as Hadoop, Spark, and Presto for SQL-like data processing. Moreover, this talk will cover an example of using Big Data tools to speed up some computations on hundreds of terabytes of events coming from the ATLAS experiment at CERN.
id cern-2737354
institution European Organization for Nuclear Research
language eng
publishDate 2020
record_format invenio
spelling cern-2737354 2022-11-02T22:36:24Z http://cds.cern.ch/record/2737354 eng Kleszcz, Emil Big Data technologies and distributed data processing with SQL (lecture) Inverted CERN School of Computing 2020 Inverted CSC Interest in solutions based on the Big Data ecosystem, such as Hadoop, is constantly increasing among many user communities at CERN, including physics experiments, monitoring, and accelerator controls. This lecture will introduce the participant to some of the Big Data technologies that CERN offers for distributed processing. It will cover some history, architecture, and specifics of selected technologies such as Hadoop, Spark, and Presto for SQL-like data processing. Moreover, this talk will cover an example of using Big Data tools to speed up some computations on hundreds of terabytes of events coming from the ATLAS experiment at CERN. oai:cds.cern.ch:2737354 2020
spellingShingle Inverted CSC
Kleszcz, Emil
Big Data technologies and distributed data processing with SQL (lecture)
title Big Data technologies and distributed data processing with SQL (lecture)
title_full Big Data technologies and distributed data processing with SQL (lecture)
title_fullStr Big Data technologies and distributed data processing with SQL (lecture)
title_full_unstemmed Big Data technologies and distributed data processing with SQL (lecture)
title_short Big Data technologies and distributed data processing with SQL (lecture)
title_sort big data technologies and distributed data processing with sql (lecture)
topic Inverted CSC
url http://cds.cern.ch/record/2737354
work_keys_str_mv AT kleszczemil bigdatatechnologiesanddistributeddataprocessingwithsqllecture
AT kleszczemil invertedcernschoolofcomputing2020