Hadoop Tutorials - Hadoop Foundations
Main authors: | Baranowski, Zbigniew; Lanza Garcia, Daniel |
---|---|
Language: | eng |
Published: | 2016 |
Subjects: | Workshops |
Online access: | http://cds.cern.ch/record/2197972 |
_version_ | 1780951211381358592 |
---|---|
author | Baranowski, Zbigniew; Lanza Garcia, Daniel |
author_facet | Baranowski, Zbigniew; Lanza Garcia, Daniel |
author_sort | Baranowski, Zbigniew |
collection | CERN |
description | <!--HTML--><p>The <strong>Hadoop</strong> ecosystem is the leading open-source platform for distributed storage and processing of "big data". The Hadoop platform is available at CERN as a central service provided by the IT department.</p>
<p>This tutorial, organized by the IT Hadoop service, introduces the main concepts of Hadoop technology in a practical way and is aimed at those who would like to <strong>start using the service for distributed parallel data processing</strong>.</p>
<p>The main <strong>topics </strong>that will be covered are:</p>
<ul>
<li>Hadoop <strong>architecture </strong>and available components</li>
<li>How to run distributed parallel processing to explore example data and create reports with SQL (using <strong>Apache Impala</strong>).</li>
<li>Using HUE, the <strong>Hadoop web UI</strong>, to present results in a user-friendly way.</li>
<li>How to format and structure data to make processing more efficient, using various data formats/containers and partitioning techniques (<strong>Avro, Parquet, HBase</strong>). Best practices in this area will also be discussed.</li>
</ul>
<p>Attendees will have access to a <strong>test Hadoop</strong> system where they can perform hands-on exercises. Instructions will be provided by the speakers. To facilitate the preparation of the test environment, <strong>please register</strong> if you plan to attend.</p> |
id | cern-2197972 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2016 |
record_format | invenio |
spelling | cern-2197972 2022-11-02T22:18:48Z http://cds.cern.ch/record/2197972 eng Baranowski, Zbigniew; Lanza Garcia, Daniel; Hadoop Tutorials - Hadoop Foundations; Workshops; oai:cds.cern.ch:2197972 2016 |
spellingShingle | Workshops; Baranowski, Zbigniew; Lanza Garcia, Daniel; Hadoop Tutorials - Hadoop Foundations |
title | Hadoop Tutorials - Hadoop Foundations |
title_full | Hadoop Tutorials - Hadoop Foundations |
title_fullStr | Hadoop Tutorials - Hadoop Foundations |
title_full_unstemmed | Hadoop Tutorials - Hadoop Foundations |
title_short | Hadoop Tutorials - Hadoop Foundations |
title_sort | hadoop tutorials - hadoop foundations |
topic | Workshops |
url | http://cds.cern.ch/record/2197972 |
work_keys_str_mv | AT baranowskizbigniew hadooptutorialshadoopfoundations AT lanzagarciadaniel hadooptutorialshadoopfoundations |