Statistical Learning and Inference at Particle Collider Experiments
Advances in data analysis techniques may play a decisive role in the discovery reach of particle collider experiments. However, the importing of expertise and methods from other data-centric disciplines such as machine learning and statistics faces significant hurdles, mainly due to the established...
Main author: | De Castro Manzano, Pablo |
---|---|
Language: | eng |
Published: | 2019 |
Subjects: | Detectors and Experimental Techniques |
Online access: | http://cds.cern.ch/record/2701341 |
_version_ | 1780964512757710848 |
---|---|
author | De Castro Manzano, Pablo |
author_facet | De Castro Manzano, Pablo |
author_sort | De Castro Manzano, Pablo |
collection | CERN |
description | Advances in data analysis techniques may play a decisive role in the discovery reach of particle collider experiments. However, the importing of expertise and methods from other data-centric disciplines such as machine learning and statistics faces significant hurdles, mainly due to the established use of different language and constructs. A large part of this document, also conceived as an introduction to the description of an analysis searching for non-resonant Higgs pair production in data collected by the CMS detector at the Large Hadron Collider (LHC), is therefore devoted to a broad redefinition of the relevant concepts for problems in experimental particle physics. The aim is to better connect these issues with those in other fields of research, so the solutions found can be repurposed. The formal exploration of the properties of the statistical models at particle colliders is useful to highlight the main challenges posed by statistical inference in this context: the multi-dimensional nature of the models, which can be studied only in a generative manner via forward simulation of observations, and the effect of nuisance parameters. The first issue can be tackled with likelihood-free inference methods coupled with the use of low-dimensional summary statistics, which may be constructed either with machine learning techniques or through physically motivated variables (e.g. event reconstruction). The second, i.e. the misspecification of the generative model which is addressed by the inclusion of nuisance parameters, reduces the effectiveness of summary statistics constructed with machine-learning techniques. |
id | cern-2701341 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2019 |
record_format | invenio |
spelling | cern-2701341; 2023-04-21T09:30:49Z; http://cds.cern.ch/record/2701341; eng; De Castro Manzano, Pablo; Statistical Learning and Inference at Particle Collider Experiments; Detectors and Experimental Techniques; CMS-TS-2019-026; CERN-THESIS-2019-209; oai:cds.cern.ch:2701341; 2019 |
spellingShingle | Detectors and Experimental Techniques De Castro Manzano, Pablo Statistical Learning and Inference at Particle Collider Experiments |
title | Statistical Learning and Inference at Particle Collider Experiments |
title_full | Statistical Learning and Inference at Particle Collider Experiments |
title_fullStr | Statistical Learning and Inference at Particle Collider Experiments |
title_full_unstemmed | Statistical Learning and Inference at Particle Collider Experiments |
title_short | Statistical Learning and Inference at Particle Collider Experiments |
title_sort | statistical learning and inference at particle collider experiments |
topic | Detectors and Experimental Techniques |
url | http://cds.cern.ch/record/2701341 |
work_keys_str_mv | AT decastromanzanopablo statisticallearningandinferenceatparticlecolliderexperiments |