Performance studies of CMS workflows using Big Data technologies

Bibliographic Details
Main author: Ambroz, Luca
Language: eng
Published: 2017
Subjects:
Online access: http://cds.cern.ch/record/2263131
Description
Summary: At the Large Hadron Collider (LHC), more than 30 petabytes of data are produced from particle collisions every year of data taking. Data processing also requires large volumes of simulated events produced with Monte Carlo techniques. Furthermore, physics analysis requires daily access to derived data formats by hundreds of users. The Worldwide LHC Computing Grid (WLCG) - an international collaboration involving personnel and computing centers worldwide - is successfully coping with these challenges, enabling the LHC physics program. With the continuation of LHC data taking and the approval of ambitious projects such as the High-Luminosity LHC, these challenges will reach the edge of current computing capacity and performance. One of the keys to success in the coming decades - also under severe financial resource constraints - is to optimize the efficiency with which computing resources are exploited. This thesis focuses on performance studies of CMS workflows, namely centrally scheduled production activities and unpredictable distributed analysis. The work aims at developing and evaluating tools to improve the understanding of the monitoring data in both production and analysis. For this reason, the work comprises two parts. Firstly, on the distributed analysis side, the development of tools to quickly analyze the logs of previous Grid job submissions can enable a user to tune the next round of submissions and better exploit the computing resources. Secondly, concerning the monitoring of both analysis and production jobs, commercial Big Data technologies can be used to obtain more efficient and flexible monitoring systems. One aspect of such improvement is the possibility to avoid major aggregations at an early stage and to collect much finer-grained monitoring data, which can be further processed at a later stage, only on request. In this thesis, work in both directions is presented.
Firstly, a lightweight tool to perform rapid studies of distributed analysis performance is presented as a way to enable physics users to adapt their job submission tasks smoothly to changing conditions of the overall environment. Secondly, a set of performance studies on the CMS Workflow Management and Data Management sectors is performed using a CMS Metrics Service prototype, based on ElasticSearch/Jupyter Notebook/Kibana technologies, that contains high-granularity information on CMS production and analysis jobs, drawn from the HTCondor ClassAds. Chapter 1 provides an overview of the Standard Model. Chapter 2 discusses the LHC accelerator complex and experiments, with a main focus on CMS. Chapter 3 introduces Computing in High Energy Physics and describes the CMS Computing Model. Chapter 4 presents the development of an original tool for evaluating the performance of local analysis jobs. Chapter 5 describes how data from the CMS Metrics Service can be analyzed to provide insights into the CMS global activities.