Big Data technologies and distributed data processing with SQL (exercise consultation)
Interest in solutions based on the Big Data ecosystem, such as Hadoop, is constantly increasing among many user communities at CERN, including physics experiments, monitoring, and accelerator controls. This lecture will introduce the participant to some of the Big Data technologie...
Main Author: | Kleszcz, Emil |
---|---|
Language: | eng |
Published: | 2020 |
Subjects: | Inverted CSC |
Online Access: | http://cds.cern.ch/record/2739361 |
_version_ | 1780968129595179008 |
---|---|
author | Kleszcz, Emil |
author_facet | Kleszcz, Emil |
author_sort | Kleszcz, Emil |
collection | CERN |
description | Interest in solutions based on the Big Data ecosystem, such as Hadoop, is constantly increasing among many user communities at CERN, including physics experiments, monitoring, and accelerator controls.
This lecture will introduce the participant to some of the Big Data technologies that CERN offers for distributed processing. It will cover some history, architecture, and specifics of selected technologies such as Hadoop, Spark, and Presto for SQL-like data processing.
Moreover, this talk will cover an example of using Big Data tools to speed up computations on hundreds of terabytes of events coming from the ATLAS experiment at CERN. |
id | cern-2739361 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2020 |
record_format | invenio |
spelling | cern-27393612022-11-02T22:36:24Zhttp://cds.cern.ch/record/2739361engKleszcz, EmilBig Data technologies and distributed data processing with SQL (exercise consultation)Inverted CERN School of Computing 2020Inverted CSCInterest in solutions based on the Big Data ecosystem, such as Hadoop, is constantly increasing among many user communities at CERN, including physics experiments, monitoring, and accelerator controls. This lecture will introduce the participant to some of the Big Data technologies that CERN offers for distributed processing. It will cover some history, architecture, and specifics of selected technologies such as Hadoop, Spark, and Presto for SQL-like data processing. Moreover, this talk will cover an example of using Big Data tools to speed up computations on hundreds of terabytes of events coming from the ATLAS experiment at CERN.oai:cds.cern.ch:27393612020 |
spellingShingle | Inverted CSC Kleszcz, Emil Big Data technologies and distributed data processing with SQL (exercise consultation) |
title | Big Data technologies and distributed data processing with SQL (exercise consultation) |
title_full | Big Data technologies and distributed data processing with SQL (exercise consultation) |
title_fullStr | Big Data technologies and distributed data processing with SQL (exercise consultation) |
title_full_unstemmed | Big Data technologies and distributed data processing with SQL (exercise consultation) |
title_short | Big Data technologies and distributed data processing with SQL (exercise consultation) |
title_sort | big data technologies and distributed data processing with sql (exercise consultation) |
topic | Inverted CSC |
url | http://cds.cern.ch/record/2739361 |
work_keys_str_mv | AT kleszczemil bigdatatechnologiesanddistributeddataprocessingwithsqlexerciseconsultation AT kleszczemil invertedcernschoolofcomputing2020 |