Hadoop Tutorials - Hadoop Foundations

The Hadoop ecosystem is the leading open-source platform for distributed storage and processing of "big data". The Hadoop platform is available at CERN as a central service provided by the IT department. This...

Bibliographic Details
Main authors: Baranowski, Zbigniew; Lanza Garcia, Daniel
Language: eng
Published: 2016
Subjects:
Online access: http://cds.cern.ch/record/2197972
_version_ 1780951211381358592
author Baranowski, Zbigniew
Lanza Garcia, Daniel
author_facet Baranowski, Zbigniew
Lanza Garcia, Daniel
author_sort Baranowski, Zbigniew
collection CERN
description The Hadoop ecosystem is the leading open-source platform for distributed storage and processing of "big data". The Hadoop platform is available at CERN as a central service provided by the IT department.
This tutorial, organized by the IT Hadoop service, aims to introduce the main concepts of Hadoop technology in a practical way and is targeted at those who would like to start using the service for distributed parallel data processing.
The main topics that will be covered are:
- Hadoop architecture and available components
- How to perform distributed parallel processing to explore example data and create reports with SQL (using Apache Impala)
- Using HUE, the Hadoop web UI, to present results in a user-friendly way
- How to format and/or structure data to make data processing more efficient, using various data formats/containers and partitioning techniques (Avro, Parquet, HBase). Best practices in this area will also be discussed.
Attendees will have access to a test Hadoop system where they will be able to perform hands-on exercises. Instructions will be provided by the speakers. To facilitate the preparation of the test environment, please register if you plan to attend.
id cern-2197972
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2016
record_format invenio
spelling cern-2197972 2022-11-02T22:18:48Z http://cds.cern.ch/record/2197972 eng Baranowski, Zbigniew Lanza Garcia, Daniel Hadoop Tutorials - Hadoop Foundations Workshops oai:cds.cern.ch:2197972 2016
spellingShingle Workshops
Baranowski, Zbigniew
Lanza Garcia, Daniel
Hadoop Tutorials - Hadoop Foundations
title Hadoop Tutorials - Hadoop Foundations
title_full Hadoop Tutorials - Hadoop Foundations
title_fullStr Hadoop Tutorials - Hadoop Foundations
title_full_unstemmed Hadoop Tutorials - Hadoop Foundations
title_short Hadoop Tutorials - Hadoop Foundations
title_sort hadoop tutorials - hadoop foundations
topic Workshops
url http://cds.cern.ch/record/2197972
work_keys_str_mv AT baranowskizbigniew hadooptutorialshadoopfoundations
AT lanzagarciadaniel hadooptutorialshadoopfoundations