
Laurelin: Java-native ROOT I/O for Apache Spark

Apache Spark is one of the predominant frameworks in the big data space, providing a fully functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in adopting conventional big data...


Bibliographic Details
Main author: Melo, Andrew Malone
Language: eng
Published: 2021
Subjects:
Online access: http://cds.cern.ch/record/2767271
_version_ 1780971287355588608
author Melo, Andrew Malone
collection CERN
description Apache Spark is one of the predominant frameworks in the big data space, providing a fully functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in adopting conventional big data frameworks for HEP workflows is their lack of support for the ROOT file format. Laurelin implements ROOT I/O as a pure Java library with no bindings to the C++ ROOT implementation, and is readily installable via standard Java packaging tools. It provides a performant interface that lets Spark read (and soon write) ROOT TTrees, so users can process these data without a pre-processing phase that converts them to an intermediate format.
id cern-2767271
institution European Organization for Nuclear Research
language eng
publishDate 2021
record_format invenio
spelling cern-2767271
2022-11-02T22:25:37Z
http://cds.cern.ch/record/2767271
eng
Melo, Andrew Malone
Laurelin: Java-native ROOT I/O for Apache Spark
25th International Conference on Computing in High Energy & Nuclear Physics
Conferences
oai:cds.cern.ch:2767271
2021
title Laurelin: Java-native ROOT I/O for Apache Spark
topic Conferences
url http://cds.cern.ch/record/2767271
work_keys_str_mv AT meloandrewmalone laurelinjavanativerootioforapachespark
AT meloandrewmalone 25thinternationalconferenceoncomputinginhighenergynuclearphysics