Optimising HEP parameter fits via Monte Carlo weight derivative regression

HEP event selection is traditionally considered a binary classification problem, involving the dichotomous categories of signal and background. In distribution fits for particle masses or couplings, however, signal events are not all equivalent, as the signal differential cross section has different...

Bibliographic Details
Main author:	Valassi, Andrea
Language:	eng
Published:	2020
Subjects:	hep-ex Particle Physics - Experiment cs.LG Computing and Computers physics.data-an Other Fields of Physics
Online access:	https://dx.doi.org/10.1051/epjconf/202024506038 http://cds.cern.ch/record/2715330

_version_	1780965427308920832
author	Valassi, Andrea
collection	CERN
description	HEP event selection is traditionally considered a binary classification problem, involving the dichotomous categories of signal and background. In distribution fits for particle masses or couplings, however, signal events are not all equivalent, as the signal differential cross section has different sensitivities to the measured parameter in different regions of phase space. In this paper, I describe a mathematical framework for the evaluation and optimization of HEP parameter fits, where this sensitivity is defined on an event-by-event basis, and for MC events it is modeled in terms of their MC weight derivatives with respect to the measured parameter. Minimising the statistical error on a measurement implies the need to resolve (i.e. separate) events with different sensitivities, which ultimately represents a non-dichotomous classification problem. Since MC weight derivatives are not available for real data, the practical strategy I suggest consists in training a regressor of weight derivatives against MC events, and then using it as an optimal partitioning variable for 1-dimensional fits of data events. This CHEP2019 paper is an extension of the study presented at CHEP2018: in particular, event-by-event sensitivities allow the exact computation of the “FIP” ratio between the Fisher information obtained from an analysis and the maximum information that could possibly be obtained with an ideal detector. Using this expression, I discuss the relationship between FIP and two metrics commonly used in Meteorology (Brier score and MSE), and the importance of “sharpness” both in HEP and in that domain. I finally point out that HEP distribution fits should be optimized and evaluated using probabilistic metrics (like FIP or MSE), whereas ranking metrics (like AUC) or threshold metrics (like accuracy) are of limited relevance for these specific problems.
id	cern-2715330
institution	Organización Europea para la Investigación Nuclear
language	eng
publishDate	2020
record_format	invenio
spelling	cern-2715330 2023-03-12T05:02:24Z doi:10.1051/epjconf/202024506038 http://cds.cern.ch/record/2715330 eng Valassi, Andrea Optimising HEP parameter fits via Monte Carlo weight derivative regression hep-ex Particle Physics - Experiment cs.LG Computing and Computers physics.data-an Other Fields of Physics arXiv:2003.12853 oai:cds.cern.ch:2715330 2020
title	Optimising HEP parameter fits via Monte Carlo weight derivative regression
topic	hep-ex Particle Physics - Experiment cs.LG Computing and Computers physics.data-an Other Fields of Physics
url	https://dx.doi.org/10.1051/epjconf/202024506038 http://cds.cern.ch/record/2715330
work_keys_str_mv	AT valassiandrea optimisinghepparameterfitsviamontecarloweightderivativeregression
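
The description field of this record states that the "FIP" ratio compares the Fisher information obtained from an analysis with the maximum obtainable with an ideal detector, and that a regressor of MC weight derivatives can serve as a partitioning variable for 1-dimensional fits. The following is a minimal Python sketch of that idea, not the author's implementation. It assumes positive MC weights and a binned Poisson fit, for which the Fisher information on the parameter is the sum over bins of (dn/dθ)²/n, and it takes the ideal case to be the limit where every event is resolved individually. The toy data, the function names (`partition_by`, `fisher_information_binned`, `fisher_information_ideal`, `fip`) and the noisy stand-in for a trained regressor are all hypothetical.

```python
import numpy as np


def partition_by(variable, n_bins):
    """Assign each event to one of n_bins equal-population bins of `variable`."""
    inner_edges = np.quantile(variable, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    return np.digitize(variable, inner_edges)  # integers in 0 .. n_bins-1


def fisher_information_binned(weights, dweights, bin_index, n_bins):
    """Fisher information of a binned Poisson fit: sum_k (dn_k/dtheta)^2 / n_k."""
    n = np.bincount(bin_index, weights=weights, minlength=n_bins)
    dn = np.bincount(bin_index, weights=dweights, minlength=n_bins)
    filled = n > 0  # skip empty bins to avoid dividing by zero
    return float(np.sum(dn[filled] ** 2 / n[filled]))


def fisher_information_ideal(weights, dweights):
    """Upper bound: every event resolved on its own (assumes weights > 0)."""
    return float(np.sum(dweights ** 2 / weights))


def fip(weights, dweights, bin_index, n_bins):
    """Ratio of the binned Fisher information to the ideal one (between 0 and 1)."""
    binned = fisher_information_binned(weights, dweights, bin_index, n_bins)
    return binned / fisher_information_ideal(weights, dweights)


# Toy MC sample (purely illustrative, not taken from the paper).
rng = np.random.default_rng(0)
n_events, n_bins = 2000, 10
weights = rng.uniform(0.5, 1.5, n_events)        # MC weights w_i
sensitivity = rng.normal(0.0, 1.0, n_events)     # (1/w_i) dw_i/dtheta
dweights = weights * sensitivity                 # dw_i/dtheta

# Three candidate partitioning variables for the 1-dimensional fit:
# the true sensitivity, a noisy estimate standing in for a trained regressor,
# and a variable unrelated to the sensitivity.
noisy_estimate = sensitivity + rng.normal(0.0, 1.0, n_events)
unrelated = rng.uniform(size=n_events)

fip_true = fip(weights, dweights, partition_by(sensitivity, n_bins), n_bins)
fip_noisy = fip(weights, dweights, partition_by(noisy_estimate, n_bins), n_bins)
fip_random = fip(weights, dweights, partition_by(unrelated, n_bins), n_bins)

# One event per bin recovers the ideal information.
fip_one_per_bin = fip(weights, dweights, np.arange(n_events), n_events)

print(f"FIP, true sensitivity bins: {fip_true:.3f}")
print(f"FIP, noisy estimate bins:   {fip_noisy:.3f}")
print(f"FIP, unrelated bins:        {fip_random:.3f}")
```

In this toy setup, the better the partitioning variable separates events with different sensitivities, the closer the ratio gets to 1. A variable unrelated to the sensitivity mixes positive and negative weight derivatives in every bin and loses most of the information.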

Similar items