Columnar data analysis with ATLAS analysis formats

Future analysis of ATLAS data will involve new small-sized analysis formats to cope with the increased storage needs. The smallest of these, named DAOD_PHYSLITE, has calibrations already applied to allow fast downstream analysis and avoid the need for further analysis-specific intermediate formats....

Bibliographic Details
Main authors: Hartmann, Nikolai; Duckeck, Guenter; Elmsheuser, Johannes
Language: eng
Published: 2021
Subjects: Particle Physics - Experiment
Online access: http://cds.cern.ch/record/2765404
author Hartmann, Nikolai
Duckeck, Guenter
Elmsheuser, Johannes
collection CERN
description Future analysis of ATLAS data will rely on new, compact analysis formats to cope with increased storage needs. The smallest of these, DAOD_PHYSLITE, has calibrations already applied, which allows fast downstream analysis and avoids the need for further analysis-specific intermediate formats. This makes it possible to apply the “columnar analysis” paradigm, in which operations are applied on a per-array rather than a per-event basis. We will present methods to read the data into memory using Uproot and discuss I/O aspects of columnar data and alternatives to the ROOT data format. Furthermore, we will show a representation of the event data model using the Awkward Array package and present a proof of concept for a simple analysis application.
id cern-2765404
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2021
record_format invenio
spelling cern-2765404; 2022-08-23T08:21:57Z; http://cds.cern.ch/record/2765404; eng; Hartmann, Nikolai; Duckeck, Guenter; Elmsheuser, Johannes; Columnar data analysis with ATLAS analysis formats; Particle Physics - Experiment; ATL-SOFT-SLIDE-2021-122; oai:cds.cern.ch:2765404; 2021-04-28
title Columnar data analysis with ATLAS analysis formats
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2765404