
Statistical Learning and Inference at Particle Collider Experiments

Advances in data analysis techniques may play a decisive role in the discovery reach of particle collider experiments. However, the importing of expertise and methods from other data-centric disciplines such as machine learning and statistics faces significant hurdles, mainly due to the established...


Bibliographic Details
Main author: De Castro Manzano, Pablo
Language: eng
Published: 2019
Subjects: Detectors and Experimental Techniques
Online access: http://cds.cern.ch/record/2701341
_version_ 1780964512757710848
author De Castro Manzano, Pablo
author_facet De Castro Manzano, Pablo
author_sort De Castro Manzano, Pablo
collection CERN
description Advances in data analysis techniques may play a decisive role in the discovery reach of particle collider experiments. However, the importing of expertise and methods from other data-centric disciplines such as machine learning and statistics faces significant hurdles, mainly due to the established use of different language and constructs. A large part of this document, also conceived as an introduction to the description of an analysis searching for non-resonant Higgs pair production in data collected by the CMS detector at the Large Hadron Collider (LHC), is therefore devoted to a broad redefinition of the relevant concepts for problems in experimental particle physics. The aim is to better connect these issues with those in other fields of research, so the solutions found can be repurposed. The formal exploration of the properties of the statistical models at particle colliders is useful to highlight the main challenges posed by statistical inference in this context: the multi-dimensional nature of the models, which can be studied only in a generative manner via forward simulation of observations, and the effect of nuisance parameters. The first issue can be tackled with likelihood-free inference methods coupled with the use of low-dimensional summary statistics, which may be constructed either with machine learning techniques or through physically motivated variables (e.g. event reconstruction). The second, i.e. the misspecification of the generative model which is addressed by the inclusion of nuisance parameters, reduces the effectiveness of summary statistics constructed with machine-learning techniques.
id cern-2701341
institution European Organization for Nuclear Research
language eng
publishDate 2019
record_format invenio
spelling cern-2701341; 2023-04-21T09:30:49Z; http://cds.cern.ch/record/2701341; eng; De Castro Manzano, Pablo; Statistical Learning and Inference at Particle Collider Experiments; Detectors and Experimental Techniques; CMS-TS-2019-026; CERN-THESIS-2019-209; oai:cds.cern.ch:2701341; 2019
spellingShingle Detectors and Experimental Techniques
De Castro Manzano, Pablo
Statistical Learning and Inference at Particle Collider Experiments
title Statistical Learning and Inference at Particle Collider Experiments
title_full Statistical Learning and Inference at Particle Collider Experiments
title_fullStr Statistical Learning and Inference at Particle Collider Experiments
title_full_unstemmed Statistical Learning and Inference at Particle Collider Experiments
title_short Statistical Learning and Inference at Particle Collider Experiments
title_sort statistical learning and inference at particle collider experiments
topic Detectors and Experimental Techniques
url http://cds.cern.ch/record/2701341
work_keys_str_mv AT decastromanzanopablo statisticallearningandinferenceatparticlecolliderexperiments