
LHC physics dataset for unsupervised New Physics detection at 40 MHz

In the particle detectors at the Large Hadron Collider, hundreds of millions of proton-proton collisions are produced every second. If one could store the whole data stream produced in these collisions, tens of terabytes of data would be written to disk every second. The general-purpose experiments...


Bibliographic Details
Main authors: Govorkova, Ekaterina, Puljak, Ema, Aarrestad, Thea, Pierini, Maurizio, Woźniak, Kinga Anna, Ngadiuba, Jennifer
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9070018/
https://www.ncbi.nlm.nih.gov/pubmed/35351897
http://dx.doi.org/10.1038/s41597-022-01187-8
_version_ 1784700554270736384
author Govorkova, Ekaterina
Puljak, Ema
Aarrestad, Thea
Pierini, Maurizio
Woźniak, Kinga Anna
Ngadiuba, Jennifer
collection PubMed
description In the particle detectors at the Large Hadron Collider, hundreds of millions of proton-proton collisions are produced every second. If one could store the whole data stream produced in these collisions, tens of terabytes of data would be written to disk every second. The general-purpose experiments ATLAS and CMS reduce this overwhelming data volume to a sustainable level, by deciding in real-time whether each collision event should be kept for further analysis or be discarded. We introduce a dataset of proton collision events that emulates a typical data stream collected by such a real-time processing system, pre-filtered by requiring the presence of at least one electron or muon. This dataset could be used to develop novel event selection strategies and assess their sensitivity to new phenomena. In particular, we intend to stimulate a community-based effort towards the design of novel algorithms for performing unsupervised new physics detection, customized to fit the bandwidth, latency and computational resource constraints of the real-time event selection system of a typical particle detector.
format Online
Article
Text
id pubmed-9070018
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9070018 2022-05-05 LHC physics dataset for unsupervised New Physics detection at 40 MHz Govorkova, Ekaterina Puljak, Ema Aarrestad, Thea Pierini, Maurizio Woźniak, Kinga Anna Ngadiuba, Jennifer Sci Data Data Descriptor In the particle detectors at the Large Hadron Collider, hundreds of millions of proton-proton collisions are produced every second. If one could store the whole data stream produced in these collisions, tens of terabytes of data would be written to disk every second. The general-purpose experiments ATLAS and CMS reduce this overwhelming data volume to a sustainable level, by deciding in real-time whether each collision event should be kept for further analysis or be discarded. We introduce a dataset of proton collision events that emulates a typical data stream collected by such a real-time processing system, pre-filtered by requiring the presence of at least one electron or muon. This dataset could be used to develop novel event selection strategies and assess their sensitivity to new phenomena. In particular, we intend to stimulate a community-based effort towards the design of novel algorithms for performing unsupervised new physics detection, customized to fit the bandwidth, latency and computational resource constraints of the real-time event selection system of a typical particle detector. Nature Publishing Group UK 2022-03-29 /pmc/articles/PMC9070018/ /pubmed/35351897 http://dx.doi.org/10.1038/s41597-022-01187-8 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title LHC physics dataset for unsupervised New Physics detection at 40 MHz
topic Data Descriptor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9070018/
https://www.ncbi.nlm.nih.gov/pubmed/35351897
http://dx.doi.org/10.1038/s41597-022-01187-8