Laurelin: Java-native ROOT I/O for Apache Spark

Apache Spark is one of the predominant frameworks in the big data space, providing a fully-functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in adopting conventional big data...

Bibliographic Details
Main author:	Melo, Andrew Malone
Language:	eng
Published:	2021
Subjects:	Conferences
Online access:	http://cds.cern.ch/record/2767271

_version_	1780971287355588608
author	Melo, Andrew Malone
author_facet	Melo, Andrew Malone
author_sort	Melo, Andrew Malone
collection	CERN
description	Apache Spark is one of the predominant frameworks in the big data space, providing a fully-functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in adopting conventional big data frameworks to HEP workflows is the lack of support for the ROOT file format in these frameworks. Laurelin implements ROOT I/O with a pure Java library, with no bindings to the C++ ROOT implementation, and is readily installable via standard Java packaging tools. It provides a performant interface enabling Spark to read (and soon write) ROOT TTrees, enabling users to process these data without a pre-processing phase converting to an intermediate format.
id	cern-2767271
institution	European Organization for Nuclear Research
language	eng
publishDate	2021
record_format	invenio
spelling	cern-2767271 2022-11-02T22:25:37Z http://cds.cern.ch/record/2767271 eng Melo, Andrew Malone Laurelin: Java-native ROOT I/O for Apache Spark 25th International Conference on Computing in High Energy & Nuclear Physics Conferences oai:cds.cern.ch:2767271 2021
spellingShingle	Conferences Melo, Andrew Malone Laurelin: Java-native ROOT I/O for Apache Spark
title	Laurelin: Java-native ROOT I/O for Apache Spark
title_full	Laurelin: Java-native ROOT I/O for Apache Spark
title_fullStr	Laurelin: Java-native ROOT I/O for Apache Spark
title_full_unstemmed	Laurelin: Java-native ROOT I/O for Apache Spark
title_short	Laurelin: Java-native ROOT I/O for Apache Spark
title_sort	laurelin: java-native root i/o for apache spark
topic	Conferences
url	http://cds.cern.ch/record/2767271
work_keys_str_mv	AT meloandrewmalone laurelinjavanativerootioforapachespark AT meloandrewmalone 25thinternationalconferenceoncomputinginhighenergynuclearphysics
