Evolution of the ROOT Tree I/O

The ROOT TTree data format encodes hundreds of petabytes of High Energy and Nuclear Physics events. Its columnar layout drives rapid analyses, as only those parts (“branches”) that are really used in a given analysis need to be read from storage. Its unique feature is the seamless C++ integration, w...

Bibliographic Details
Main authors: Blomer, Jakob; Canal, Philippe; Naumann, Axel; Piparo, Danilo
Language: eng
Published: 2020
Subjects: hep-ex; Particle Physics - Experiment; cs.DB; Computing and Computers
Online access: https://dx.doi.org/10.1051/epjconf/202024502030
http://cds.cern.ch/record/2715450
_version_ 1780965434851328000
author Blomer, Jakob
Canal, Philippe
Naumann, Axel
Piparo, Danilo
author_facet Blomer, Jakob
Canal, Philippe
Naumann, Axel
Piparo, Danilo
author_sort Blomer, Jakob
collection CERN
description The ROOT TTree data format encodes hundreds of petabytes of High Energy and Nuclear Physics events. Its columnar layout drives rapid analyses, as only those parts (“branches”) that are really used in a given analysis need to be read from storage. Its unique feature is the seamless C++ integration, which allows users to directly store their event classes without explicitly defining data schemas. In this contribution, we present the status and plans of the future ROOT 7 event I/O. Along with the ROOT 7 interface modernization, we aim for robust, where possible compile-time safe C++ interfaces to read and write event data. On the performance side, we show first benchmarks using ROOT’s new experimental I/O subsystem that combines the best of TTrees with recent advances in columnar data formats. A core ingredient is a strong separation of the high-level logical data layout (C++ classes) from the low-level physical data layout (storage backed nested vectors of simple types). We show how the new, optimized physical data layout speeds up serialization and deserialization and facilitates parallel, vectorized and bulk operations. This lets ROOT I/O run optimally on the upcoming ultra-fast NVRAM storage devices, as well as file-less storage systems such as object stores.
id cern-2715450
institution European Organization for Nuclear Research
language eng
publishDate 2020
record_format invenio
spelling cern-2715450
2021-02-27T05:02:40Z
doi:10.1051/epjconf/202024502030
http://cds.cern.ch/record/2715450
eng
Blomer, Jakob
Canal, Philippe
Naumann, Axel
Piparo, Danilo
Evolution of the ROOT Tree I/O
hep-ex
Particle Physics - Experiment
cs.DB
Computing and Computers
arXiv:2003.07669
FERMILAB-CONF-20-165-SCD
oai:cds.cern.ch:2715450
2020
spellingShingle hep-ex
Particle Physics - Experiment
cs.DB
Computing and Computers
Blomer, Jakob
Canal, Philippe
Naumann, Axel
Piparo, Danilo
Evolution of the ROOT Tree I/O
title Evolution of the ROOT Tree I/O
title_full Evolution of the ROOT Tree I/O
title_fullStr Evolution of the ROOT Tree I/O
title_full_unstemmed Evolution of the ROOT Tree I/O
title_short Evolution of the ROOT Tree I/O
title_sort evolution of the root tree i/o
topic hep-ex
Particle Physics - Experiment
cs.DB
Computing and Computers
url https://dx.doi.org/10.1051/epjconf/202024502030
http://cds.cern.ch/record/2715450
work_keys_str_mv AT blomerjakob evolutionoftheroottreeio
AT canalphilippe evolutionoftheroottreeio
AT naumannaxel evolutionoftheroottreeio
AT piparodanilo evolutionoftheroottreeio