Techniques for parametric simulation with deep neural networks and implementation for the LHCb experiment at CERN and its future upgrades
Main author: |  |
---|---|
Language: | eng |
Published: | 2022 |
Subjects: |  |
Online access: | http://cds.cern.ch/record/2826210 |
Summary:

The present knowledge of elementary particles and their interactions is collected within a successful theory named the Standard Model, which continues to predict the majority of the experimental results obtained to date. However, despite its clear success, the Standard Model is not a complete theory, since several questions remain unanswered: dark matter, neutrino masses and the abundance of matter over antimatter are just a few examples. It is therefore necessary to keep testing the theory, searching for evidence of phenomena beyond the Standard Model that may eventually point the way towards New Physics. To this end, it is crucial to improve the precision currently reached by particle physics experiments and to provide meaningful comparisons with theoretical expectations. In this scenario the Large Hadron Collider (LHC), the world's largest and most powerful particle accelerator, plays a key role, facing the technological challenges needed to move forward in the understanding of the Universe and its laws.

The LHCb experiment is one of the four detectors along the accelerator ring and is dedicated to the study of heavy flavour physics at the LHC. Its primary goal is to look for indirect evidence of New Physics in CP violation and in rare decays of b- and c-hadrons. Because of its field of interest, the LHCb detector has a peculiar geometry, completely different from that of the other LHC experiments, designed to maximize the angular acceptance of high-energy $b\bar{b}$ pairs. Moreover, in order to carry out its physics programme, LHCb reconstructs decay vertices with extreme precision, measures track momenta accurately and provides a robust particle identification system. All these features make it possible to identify heavy hadron decays with high efficiency and to measure rare processes, exploiting the flexibility of a trigger system capable of coping effectively with the harsh environment produced by a hadronic collider.

To improve its capabilities in the search for New Physics, the LHCb detector is currently being upgraded, exploiting the stop of data taking during Long Shutdown 2 (2018-2021). The LHCb Upgrade detector will operate throughout LHC Runs 3 and 4, and will be able to take full advantage of a fivefold increase in instantaneous luminosity and of a purely software trigger expected to improve the selection efficiency by at least a factor of two. This will result in data samples larger by roughly an order of magnitude, and in the possibility of reaching unprecedented physics accuracy. However, huge datasets alone are not enough to achieve the physics goals, which require accurate studies of the background contributions in the data and of the selection efficiencies of the trigger and particle identification systems. To this end, it is necessary to produce simulated samples at least as large as the collected data. Consequently, making the most of the upgraded experiment requires a significant improvement of the simulation production. With the fully software trigger designed for the LHCb Upgrade detector, the simulation will consume almost the totality of the computing resources available to the experiment. Nevertheless, the traditional simulation system is already incapable of sustaining the analysis needs of the physics groups, and this scenario will only get worse. The traditional simulation can be split into two main independent phases.
The first phase consists of the event generation, from the proton-proton collisions down to the decay of the particles into the channels of interest for the physics programme. The second phase consists in tracking the particles produced in the generation phase through the LHCb detector, simulating all the radiation-matter interactions occurring within it. Because these interactions are fully reproduced, the second phase involves CPU-intensive computations that are the real bottleneck, in terms of time performance, of the entire simulation process. This strategy is named full simulation. The full simulation has already saturated the available computing resources and cannot provide all analyses with the simulated samples they need. As a consequence, analyses that need large samples carry an uncertainty, in many cases already dominant, due to the statistical uncertainty of the simulated samples. In order to prevent this scenario from occurring also for the LHCb Upgrade detector, faster simulation options must be adopted.

The maximum speed-up of the simulation system is obtained by renouncing the simulation of the radiation-matter interactions altogether. Such strategies are called ultra-fast simulation and consist of reproducing the high-level response of the detectors, mapping it with parametric or non-parametric functions. Among the non-parametric solutions, methods based on Machine Learning algorithms and, in particular, on Generative Adversarial Networks (GANs) have proved to be very promising. GANs are a powerful class of generative models based on the simultaneous training of two neural networks. The first neural network, named generator G, outputs synthetic data, trying to reproduce the probability distributions of the elements contained in a reference sample. The second neural network, named discriminator D, receives as input elements from both the reference sample and the generated one, and outputs the probability that its input belongs to the reference sample rather than to the generated one. The discriminator is trained to distinguish the origin of its inputs, while the generator is simultaneously trained to hinder the discriminator's task. This framework corresponds to a minimax two-player game: in the space of arbitrary functions G and D, a unique solution exists, with the generator recovering the reference data distribution and the discriminator output equal to 1/2 everywhere. In recent years, GANs have become a backbone of Computer Vision, proving their capacity to reproduce highly faithful and diverse probability distributions with models learned directly from data. It is therefore not surprising that models trained through this minimax procedure are able to reproduce the detector responses accurately and, in particular, to map basic information about track kinematics and event multiplicity into the correct distributions of different high-level particle identification variables. The idea is to exploit the competition between the generator and the discriminator to update their respective parameters (the training process), and then to adopt the generator as a generative model capable of parameterizing the response of the LHCb particle identification system. Such models can then be implemented within the simulation framework of the LHCb experiment, providing a powerful ultra-fast and fairly accurate solution to produce the huge simulated samples required by the analyses.
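The minimax game described above is the standard adversarial formulation of Goodfellow et al. (2014), in which generator and discriminator optimize the value function

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big],$$

whose unique equilibrium is reached when the generated distribution $p_g$ equals $p_{\text{data}}$, so that the optimal discriminator $D^*(x) = p_{\text{data}}(x)/\big(p_{\text{data}}(x) + p_g(x)\big)$ becomes $1/2$ everywhere.

As a purely illustrative sketch, and not the implementation adopted in the thesis, a single training step of a conditional GAN of this kind could look as follows in TensorFlow; all network sizes, names and dimensions are hypothetical placeholders for the track kinematics, detector occupancy and high-level particle identification variables mentioned above.

```python
# Minimal conditional-GAN training step (illustrative sketch, hypothetical
# sizes/names): G maps noise + track kinematics to synthetic PID variables,
# D tries to separate them from the reference (fully simulated) sample.
import tensorflow as tf

NOISE_DIM, COND_DIM, PID_DIM = 64, 4, 4  # latent size, conditioning variables, PID outputs

def make_mlp(in_dim, out_dim, out_activation=None):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(in_dim,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(out_dim, activation=out_activation),
    ])

generator = make_mlp(NOISE_DIM + COND_DIM, PID_DIM)          # G(z | kinematics)
discriminator = make_mlp(PID_DIM + COND_DIM, 1, "sigmoid")   # D(x | kinematics)
bce = tf.keras.losses.BinaryCrossentropy()
g_opt, d_opt = tf.keras.optimizers.Adam(1e-4), tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(real_pid, kinematics):
    noise = tf.random.normal([tf.shape(real_pid)[0], NOISE_DIM])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_pid = generator(tf.concat([noise, kinematics], axis=1))
        d_real = discriminator(tf.concat([real_pid, kinematics], axis=1))
        d_fake = discriminator(tf.concat([fake_pid, kinematics], axis=1))
        # Discriminator: label reference elements as 1, generated ones as 0.
        d_loss = bce(tf.ones_like(d_real), d_real) + bce(tf.zeros_like(d_fake), d_fake)
        # Generator: non-saturating loss, i.e. try to fool the discriminator.
        g_loss = bce(tf.ones_like(d_fake), d_fake)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    return d_loss, g_loss
```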
A large part of my thesis work has been devoted to the development and implementation of GAN-based generative models for the LHCb particle identification system. Although GANs have shown remarkable capabilities in learning probability distributions directly from data, their training is often difficult, and reproducing the entire reference space is not always guaranteed. In order to maximize the probability of correct convergence of the trained generative models, I have studied in depth the theory behind adversarial systems, implementing state-of-the-art algorithms capable of effectively accomplishing the target set. In order to tune and select the best model for each subdetector of the particle identification system, I have developed a statistical method based on robust multivariate classifiers to test the quality of the generated samples. This statistical method has been used to choose the best data-preparation strategies for completing the training process successfully, and to assess the quality reached by the different learning algorithms proposed in the literature. The resulting models output realistic distributions for the high-level particle identification variables of LHCb, reproducing a wide variety of shapes starting from the track kinematic parameters and a measure of the detector occupancy. The trained models can be exploited within a simulation framework to offer an efficient ultra-fast solution for producing large simulated samples.

During my thesis work I have also participated in the development of a new simulation framework designed to take full advantage of modern software for evaluating large computational graphs, such as complex neural networks. In particular, the integration of the trained models within this simulation framework is a personal contribution. The framework has proved able to produce faithful simulated samples starting from the generation phase, and to provide a detector parameterization competitive with the one obtained from the full simulation.

Chapter 1 gives an overview of the Standard Model and introduces the reasons behind the need for large simulated samples, taking as an example the needs of the LHCb experiment, presented in detail in Section 1.2. In Chapter 2, the limitations of the full simulation approach are treated and some faster solutions are discussed. Chapter 3 is dedicated to the Machine Learning paradigm, with particular attention to generative models; in particular, the theory behind Generative Adversarial Networks and their training problems is discussed in Section 3.3. Chapter 4 reports details on the implementation of the GAN models for the particle identification system of LHCb: the statistical method mentioned above is described in Section 4.2.2, while Section 4.4 shows the distributions obtained at the end of the training process. Finally, Chapter 5 discusses the new simulation framework, paying particular attention to its technical implementation; its results are reported in Section 5.4.
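As a minimal illustration of the classifier-based quality test mentioned above (not the exact procedure developed in the thesis), one can train a multivariate classifier to separate the reference sample from the generated one: a ROC AUC compatible with 0.5 indicates that the two samples are statistically indistinguishable. The names and the toy data below are hypothetical.

```python
# Two-sample test with a multivariate classifier (illustrative sketch):
# an AUC close to 0.5 means the classifier cannot tell the generated
# sample apart from the reference one.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def two_sample_auc(reference, generated, random_state=0):
    """ROC AUC of a classifier trained to separate reference from generated data."""
    X = np.vstack([reference, generated])
    y = np.concatenate([np.ones(len(reference)), np.zeros(len(generated))])
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, stratify=y, random_state=random_state)
    clf = GradientBoostingClassifier(random_state=random_state)
    clf.fit(X_train, y_train)
    return roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

# Toy check: two samples drawn from the same distribution give AUC ~ 0.5.
rng = np.random.default_rng(seed=0)
reference = rng.normal(size=(5000, 4))   # e.g. four high-level PID variables
generated = rng.normal(size=(5000, 4))   # e.g. output of the generative model
print(f"two-sample AUC: {two_sample_auc(reference, generated):.3f}")
```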