_version_ 1780960791272357888
author Sakulin, Hannes
Andre, Jean-Marc
Behrens, Ulf
Branson, James
Brummer, Philipp
Cittolin, Sergio
da Silva Gomes, Diego
Darlea, Georgiana-Lavinia
Deldicque, Christian
Demiragli, Zeynep
Dobson, Marc
Doualot, Nicolas
Erhan, Samim
Fulcher, Jonathan Richard
Gigi, Dominique
Gladki, Maciej
Glege, Frank
Gomez-Ceballos, Guillelmo
Hegeman, Jeroen
Holzner, Andre
Lettrich, Michael
Mecionis, Audrius
Meijers, Frans
Meschi, Emilio
Mommsen, Remigius K.
Morovic, Srecko
O'Dell, Vivian
Orsini, Luciano
Papakrivopoulos, Ioannis
Paus, Christoph
Petrucci, Andrea
Pieri, Marco
Rabady, Dinyar
Racz, Attila
Rapsevicius, Valdas
Reis, Thomas
Schwick, Christoph
Simelevicius, Dainius
Stankevicius, Mantas
Vazquez Velez, Cristina
Vougioukas, Michail
Wernet, Christian
Zejdl, Petr
author_facet Sakulin, Hannes
Andre, Jean-Marc
Behrens, Ulf
Branson, James
Brummer, Philipp
Cittolin, Sergio
da Silva Gomes, Diego
Darlea, Georgiana-Lavinia
Deldicque, Christian
Demiragli, Zeynep
Dobson, Marc
Doualot, Nicolas
Erhan, Samim
Fulcher, Jonathan Richard
Gigi, Dominique
Gladki, Maciej
Glege, Frank
Gomez-Ceballos, Guillelmo
Hegeman, Jeroen
Holzner, Andre
Lettrich, Michael
Mecionis, Audrius
Meijers, Frans
Meschi, Emilio
Mommsen, Remigius K.
Morovic, Srecko
O'Dell, Vivian
Orsini, Luciano
Papakrivopoulos, Ioannis
Paus, Christoph
Petrucci, Andrea
Pieri, Marco
Rabady, Dinyar
Racz, Attila
Rapsevicius, Valdas
Reis, Thomas
Schwick, Christoph
Simelevicius, Dainius
Stankevicius, Mantas
Vazquez Velez, Cristina
Vougioukas, Michail
Wernet, Christian
Zejdl, Petr
author_sort Sakulin, Hannes
collection CERN
description The data acquisition (DAQ) system of the Compact Muon Solenoid (CMS) at CERN reads out the detector at the level-1 trigger accept rate of 100 kHz, assembles events with a bandwidth of 200 GB/s, provides these events to the high-level trigger running on a farm of about 30k cores, and records the accepted events. Comprising custom-built and cutting-edge commercial hardware and several thousand instances of software applications, the DAQ system is complex in itself, and failures cannot be completely excluded. Moreover, problems in the readout of the detectors, in the first-level trigger system, or in the high-level trigger may provoke anomalous behaviour of the DAQ system that sometimes cannot easily be differentiated from a problem in the DAQ system itself. To achieve high data-taking efficiency with operators from the entire collaboration, and without relying too heavily on the on-call experts, an expert system, the DAQ-Expert, has been developed that can pinpoint the source of most failures and advise the shift crew on the quickest way to recover. The DAQ-Expert constantly analyzes monitoring data from the DAQ system and the high-level trigger using logic modules written in Java that encapsulate the expert knowledge about potential operational problems. The results of the reasoning are presented to the operator in a web-based dashboard, may trigger sound alerts in the control room, and are archived for post-mortem analysis, where they can be viewed in a web-based timeline browser. We present the design of the DAQ-Expert and report on the operational experience since 2017, when it was first put into production.
id cern-2650367
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2018
record_format invenio
spelling cern-2650367 2022-08-10T12:21:08Z doi:10.1051/epjconf/201921401015 http://cds.cern.ch/record/2650367 eng Operational experience with the new CMS DAQ-Expert Detectors and Experimental Techniques CMS-CR-2018-406 oai:cds.cern.ch:2650367 2018-12-03
spellingShingle Detectors and Experimental Techniques
Sakulin, Hannes
Andre, Jean-Marc
Behrens, Ulf
Branson, James
Brummer, Philipp
Cittolin, Sergio
da Silva Gomes, Diego
Darlea, Georgiana-Lavinia
Deldicque, Christian
Demiragli, Zeynep
Dobson, Marc
Doualot, Nicolas
Erhan, Samim
Fulcher, Jonathan Richard
Gigi, Dominique
Gladki, Maciej
Glege, Frank
Gomez-Ceballos, Guillelmo
Hegeman, Jeroen
Holzner, Andre
Lettrich, Michael
Mecionis, Audrius
Meijers, Frans
Meschi, Emilio
Mommsen, Remigius K.
Morovic, Srecko
O'Dell, Vivian
Orsini, Luciano
Papakrivopoulos, Ioannis
Paus, Christoph
Petrucci, Andrea
Pieri, Marco
Rabady, Dinyar
Racz, Attila
Rapsevicius, Valdas
Reis, Thomas
Schwick, Christoph
Simelevicius, Dainius
Stankevicius, Mantas
Vazquez Velez, Cristina
Vougioukas, Michail
Wernet, Christian
Zejdl, Petr
Operational experience with the new CMS DAQ-Expert
title Operational experience with the new CMS DAQ-Expert
title_full Operational experience with the new CMS DAQ-Expert
title_fullStr Operational experience with the new CMS DAQ-Expert
title_full_unstemmed Operational experience with the new CMS DAQ-Expert
title_short Operational experience with the new CMS DAQ-Expert
title_sort operational experience with the new cms daq-expert
topic Detectors and Experimental Techniques
url https://dx.doi.org/10.1051/epjconf/201921401015
http://cds.cern.ch/record/2650367
work_keys_str_mv AT sakulinhannes operationalexperiencewiththenewcmsdaqexpert
AT andrejeanmarc operationalexperiencewiththenewcmsdaqexpert
AT behrensulf operationalexperiencewiththenewcmsdaqexpert
AT bransonjames operationalexperiencewiththenewcmsdaqexpert
AT brummerphilipp operationalexperiencewiththenewcmsdaqexpert
AT cittolinsergio operationalexperiencewiththenewcmsdaqexpert
AT dasilvagomesdiego operationalexperiencewiththenewcmsdaqexpert
AT darleageorgianalavinia operationalexperiencewiththenewcmsdaqexpert
AT deldicquechristian operationalexperiencewiththenewcmsdaqexpert
AT demiraglizeynep operationalexperiencewiththenewcmsdaqexpert
AT dobsonmarc operationalexperiencewiththenewcmsdaqexpert
AT doualotnicolas operationalexperiencewiththenewcmsdaqexpert
AT erhansamim operationalexperiencewiththenewcmsdaqexpert
AT fulcherjonathanrichard operationalexperiencewiththenewcmsdaqexpert
AT gigidominique operationalexperiencewiththenewcmsdaqexpert
AT gladkimaciej operationalexperiencewiththenewcmsdaqexpert
AT glegefrank operationalexperiencewiththenewcmsdaqexpert
AT gomezceballosguillelmo operationalexperiencewiththenewcmsdaqexpert
AT hegemanjeroen operationalexperiencewiththenewcmsdaqexpert
AT holznerandre operationalexperiencewiththenewcmsdaqexpert
AT lettrichmichael operationalexperiencewiththenewcmsdaqexpert
AT mecionisaudrius operationalexperiencewiththenewcmsdaqexpert
AT meijersfrans operationalexperiencewiththenewcmsdaqexpert
AT meschiemilio operationalexperiencewiththenewcmsdaqexpert
AT mommsenremigiusk operationalexperiencewiththenewcmsdaqexpert
AT morovicsrecko operationalexperiencewiththenewcmsdaqexpert
AT odellvivian operationalexperiencewiththenewcmsdaqexpert
AT orsiniluciano operationalexperiencewiththenewcmsdaqexpert
AT papakrivopoulosioannis operationalexperiencewiththenewcmsdaqexpert
AT pauschristoph operationalexperiencewiththenewcmsdaqexpert
AT petrucciandrea operationalexperiencewiththenewcmsdaqexpert
AT pierimarco operationalexperiencewiththenewcmsdaqexpert
AT rabadydinyar operationalexperiencewiththenewcmsdaqexpert
AT raczattila operationalexperiencewiththenewcmsdaqexpert
AT rapseviciusvaldas operationalexperiencewiththenewcmsdaqexpert
AT reisthomas operationalexperiencewiththenewcmsdaqexpert
AT schwickchristoph operationalexperiencewiththenewcmsdaqexpert
AT simeleviciusdainius operationalexperiencewiththenewcmsdaqexpert
AT stankeviciusmantas operationalexperiencewiththenewcmsdaqexpert
AT vazquezvelezcristina operationalexperiencewiththenewcmsdaqexpert
AT vougioukasmichail operationalexperiencewiththenewcmsdaqexpert
AT wernetchristian operationalexperiencewiththenewcmsdaqexpert
AT zejdlpetr operationalexperiencewiththenewcmsdaqexpert