Laurelin: Java-native ROOT I/O for Apache Spark
Apache Spark is one of the predominant frameworks in the big data space, providing a fully-functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in adopting conventional big data...
Main author: | Melo, Andrew Malone |
---|---|
Language: | eng |
Published: | 2021 |
Subjects: | Conferences |
Online access: | http://cds.cern.ch/record/2767271 |
_version_ | 1780971287355588608 |
---|---|
author | Melo, Andrew Malone |
author_facet | Melo, Andrew Malone |
author_sort | Melo, Andrew Malone |
collection | CERN |
description | Apache Spark is one of the predominant frameworks in the big data space, providing a fully-functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in adopting conventional big data frameworks to HEP workflows is the lack of support for the ROOT file format in these frameworks. Laurelin implements ROOT I/O with a pure Java library, with no bindings to the C++ ROOT implementation, and is readily installable via standard Java packaging tools. It provides a performant interface enabling Spark to read (and soon write) ROOT TTrees, enabling users to process these data without a pre-processing phase converting to an intermediate format. |
id | cern-2767271 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2021 |
record_format | invenio |
spelling | cern-2767271 2022-11-02T22:25:37Z http://cds.cern.ch/record/2767271 eng Melo, Andrew Malone Laurelin: Java-native ROOT I/O for Apache Spark 25th International Conference on Computing in High Energy & Nuclear Physics Conferences Apache Spark is one of the predominant frameworks in the big data space, providing a fully-functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in adopting conventional big data frameworks to HEP workflows is the lack of support for the ROOT file format in these frameworks. Laurelin implements ROOT I/O with a pure Java library, with no bindings to the C++ ROOT implementation, and is readily installable via standard Java packaging tools. It provides a performant interface enabling Spark to read (and soon write) ROOT TTrees, enabling users to process these data without a pre-processing phase converting to an intermediate format. oai:cds.cern.ch:2767271 2021 |
spellingShingle | Conferences Melo, Andrew Malone Laurelin: Java-native ROOT I/O for Apache Spark |
title | Laurelin: Java-native ROOT I/O for Apache Spark |
title_full | Laurelin: Java-native ROOT I/O for Apache Spark |
title_fullStr | Laurelin: Java-native ROOT I/O for Apache Spark |
title_full_unstemmed | Laurelin: Java-native ROOT I/O for Apache Spark |
title_short | Laurelin: Java-native ROOT I/O for Apache Spark |
title_sort | laurelin: java-native root i/o for apache spark |
topic | Conferences |
url | http://cds.cern.ch/record/2767271 |
work_keys_str_mv | AT meloandrewmalone laurelinjavanativerootioforapachespark AT meloandrewmalone 25thinternationalconferenceoncomputinginhighenergynuclearphysics |