
The CMS Trigger Supervisor: Control and Hardware Monitoring System of the CMS Level-1 Trigger at CERN

The experiments CMS (Compact Muon Solenoid) and ATLAS (A Toroidal LHC ApparatuS) at the Large Hadron Collider (LHC) are the greatest exponents of the rising complexity in High Energy Physics (HEP) data handling instrumentation. Tens of millions of readout channels, tens of thousands of hardware boards...


Bibliographic Details
Main author: Ildefons Magrans de Abril
Language: eng
Published: 2008
Subjects:
Online access: http://cds.cern.ch/record/1446282
_version_ 1780924788430077952
author Ildefons Magrans de Abril
author_facet Ildefons Magrans de Abril
author_sort Ildefons Magrans de Abril
collection CERN
description The experiments CMS (Compact Muon Solenoid) and ATLAS (A Toroidal LHC ApparatuS) at the Large Hadron Collider (LHC) are the greatest exponents of the rising complexity in High Energy Physics (HEP) data handling instrumentation. Tens of millions of readout channels, tens of thousands of hardware boards and the same order of connections are figures of merit. However, the hardware volume is not the only complexity dimension: the unprecedentedly large number of research institutes and scientists that form the international collaborations, and the long design, development, commissioning and operational phases, are additional factors that must be taken into account.

The Level-1 (L1) trigger decision loop is an excellent example of these difficulties. This system is based on pipelined logic designed to analyze, without deadtime, the data from each LHC bunch crossing occurring every 25 ns, using special coarsely segmented trigger data from the detectors. The L1 trigger is responsible for reducing the rate of accepted crossings to below 100 kHz. While the L1 trigger is taking its decision, the full high-precision data of all detector channels are stored in the detector front-end buffers, which are only read out if the event is accepted. The Level-1 Accept (L1A) decision is communicated to the sub-detectors through the Timing, Trigger and Control (TTC) system. The L1 decision loop hardware system was built by more than ten research institutes over a development and construction period of nearly ten years, and features more than fifty VME crates and thousands of boards and connections.

In this context, it is mandatory to provide software tools that ease the integration and the short-, medium- and long-term operation of the experiment. This research work proposes solutions, based on web services technologies, to simplify the implementation and operation of software control systems that manage hardware devices for HEP experiments. The main contribution of this work is the design and development of a hardware management system intended to enable the operation and integration of the L1 decision loop of the CMS experiment: the CMS Trigger Supervisor (TS).

The TS conceptual design proposes a hierarchical distributed system which fits well the web services based model of the CMS Online SoftWare Infrastructure (OSWI). The functional scope of this system covers the configuration, testing and monitoring of the L1 decision loop hardware, and its interaction with the overall CMS experiment control system and the rest of the experiment. Together with the technical design aspects, the project organization strategy is discussed.

The main topic follows an initial investigation of the usage of the eXtensible Markup Language (XML) as a uniform data representation format for a software environment to implement hardware management systems for HEP experiments. This model extends the usage of XML beyond the boundaries of the control and monitoring related data and proposes its usage also for the code. This effort, carried out in the context of the CMS Trigger and Data Acquisition project, improved the overall team knowledge of XML technologies, created a pool of ideas and helped to anticipate the main TS requirements and architectural concepts.
[Excerpts from the thesis front matter (Visual Summary, Contents, Acronyms) and from Chapters 1 and 4 follow.]

2) An interpreted, run-time extensible, high-level control language for these sequences that provides independence from the specific hosts and interconnect systems to which devices are attached.

This model, as compared to other approaches [40], enforces the uniform use of XML syntax to describe configuration data, device specifications, and control sequences for the configuration and control of hardware devices. This means that control sequences can be treated as data, making it easy to write scripts that manipulate other scripts and to embed them into other XML documents. In addition, the unified model makes it possible to use the same concepts, tools, and persistency mechanisms, which simplifies the software configuration management of large projects.

A sub-system cell is implemented as a class that inherits from the CellAbstract class, which in turn is a descendant of the xdaq::Application class. The fact that a sub-system cell is an XDAQ application allows the sub-system cell to be added to an XDAQ partition, thus making it browsable through the XDAQ HTTP/CGI interface. The XDAQ SOAP Remote Procedure Call (RPC) interface is also available to the sub-system cell. The RPC interface, implemented in the CellAbstract class, allows remote usage of the cell operations and commands. The CellAbstract class is also responsible for the dynamic creation of communication channels between the cell and external services, also known as "xhannels". The xhannel run-time setup is done according to an XML file known as the "xhannel list". The CellAbstract class implements a GUI, accessible through the XDAQ HTTP/CGI interface, which can be extended with custom graphical setups called "control panels".

The CellAbstractContext is a shared object among all instances of CellObject in a given cell, in particular all CellCommand and CellOperation instances. It provides access to the factories and to the xhannels. In some cases, the sub-system cell context gives access to a sub-system hardware driver; therefore, all CellCommand and CellOperation instances can control the hardware. The CellObject interface also facilitates access to the logging infrastructure through the logger object. Each CellCommand or CellOperation object has a CellWarning object.

The CellCommand has one public method named run(). When this method is called, a sequence of three virtual methods is executed. These virtual methods have to be implemented in the specific CellSubsystemCommand class: 1) the init() method initializes those objects that will be used in the precondition() and code() methods (Section 4.3.2); 2) the precondition() method checks the necessary conditions to execute the command; and 3) the code() method defines the functionality of the command. The warning message and level can be read or written within any of these methods. Finally, the run() method returns the reply SOAP message, which embeds a serialized version in XML of the code() method result and warning objects.
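To make the command life cycle concrete, below is a minimal, self-contained C++ sketch of the pattern just described. The hook names init(), precondition() and code() and the role of run() come from the text; all signatures, the CellWarning fields and the mock reply serialization are simplifying assumptions, not the actual TS framework API.

```cpp
#include <iostream>
#include <string>

// Simplified, self-contained sketch of the CellCommand pattern described above.
// The real framework classes (CellCommand, CellWarning, SOAP serialization)
// are only approximated here; signatures and the reply format are illustrative.
struct CellWarning {
    int level = 0;            // illustrative value; the real levels are listed in Appendix A
    std::string message;
};

class CellCommand {
public:
    virtual ~CellCommand() = default;

    // run() drives the three-step sequence and builds a mock reply that stands
    // in for the serialized SOAP message returned to the controller.
    std::string run() {
        init();
        if (!precondition()) {
            warning_.message = "precondition failed";
            return makeReply("");
        }
        return makeReply(code());
    }

protected:
    virtual void init() = 0;          // prepare objects used by the other hooks
    virtual bool precondition() = 0;  // check the conditions to execute the command
    virtual std::string code() = 0;   // the actual command functionality

    CellWarning warning_;             // readable and writable from any of the three hooks

private:
    std::string makeReply(const std::string& result) const {
        return "<reply><result>" + result + "</result><warning>"
               + warning_.message + "</warning></reply>";
    }
};

// Hypothetical sub-system command that "configures" a fake board register.
class ConfigureBoardCommand : public CellCommand {
    void init() override         { value_ = 0x2A; }
    bool precondition() override { return value_ != 0; }
    std::string code() override  { return "register set to " + std::to_string(value_); }
    int value_ = 0;
};

int main() {
    ConfigureBoardCommand cmd;
    std::cout << cmd.run() << std::endl;   // prints the mock reply
}
```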
HTTP/CGI requests are served by the default() virtual method of the xdaq::Application class. This method parses the input HTTP/CGI request, which is available as a Cgicc input argument (Section 4.4.2.6). The HTTP/CGI response is written into the Cgicc output argument at the end of the default() method and is sent back by the executive to the browser. The TS GUI is presented in Section 4.4.4.11.

Each sub-system command is added to the cell using the CellAbstract::addCommand() method. All SOAP commands are served by the same callback method, CellAbstract::command(). This method uses the CommandFactory object to create a CellCommand object and executes the command's public method CellCommand::run() (Section 4.4.4.2). The SOAP message object returned by the run() method is forwarded by the executive to the controller. Section 4.4.4.6 discusses in more detail the implementation of the synchronous and asynchronous interaction with the controller, and Appendix A presents the SOAP API from the controller's point of view.

On reception of a request, the cell creates the corresponding CellCommand object and then executes the method CellCommand::run(), which returns the SOAP reply message (Section 4.4.4.2). In the synchronous case, the CellCommand::run() method returns just after executing the code() method. In the asynchronous case, the CellCommand::run() method returns immediately after starting the execution of the code() method, which continues running in a dedicated thread. The asynchronous SOAP reply message is sent back to the controller by this thread when the code() method finishes. The thread is facilitated by the cell command's inheritance from the toolbox::lang::Class class. Figure 4-12 shows a simplified sequence diagram of the interaction between a controller and a cell using the synchronous and asynchronous SOAP message protocols.

If a software exception is raised while serving a synchronous request, the reply message is built with a warning level equal to 3000 (Appendix A) and with a warning message specifying the software exception. When the command or operation transition method is executed after an asynchronous request, all possible exceptions are caught in the same thread where the code() method runs; in this second case, the thread itself builds the reply message with the adequate warning information.

In case the cell dies during the execution of a given synchronous request, this will be detected on the client side because the socket connection between the client and the cell is broken. If the request is sent in asynchronous mode, the request message is sent through a socket which is closed just after the acknowledge message is received; the reply message is then sent through a second socket opened by the cell. Therefore, the client is not automatically informed if the cell dies, and it is the client's responsibility to implement a time-out or a periodic "ping" routine to check that the cell is still alive.
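The following is a minimal sketch of the synchronous versus asynchronous dispatch just described, using plain std::thread, std::string and callbacks as stand-ins for the real SOAP sockets, acknowledge message and XDAQ work loops; the function names and signatures are illustrative and not part of the framework.

```cpp
#include <chrono>
#include <functional>
#include <iostream>
#include <string>
#include <thread>

using Reply = std::string;

// Synchronous mode: the call returns only after code() has finished, so the
// controller blocks on the open socket until the reply arrives.
Reply runSynchronous(const std::function<Reply()>& code) {
    return code();
}

// Asynchronous mode: return an acknowledge immediately and run code() in a
// dedicated thread; when code() finishes, that thread delivers the real reply
// through a callback (standing in for the second socket opened by the cell).
Reply runAsynchronous(const std::function<Reply()>& code,
                      const std::function<void(const Reply&)>& sendReply) {
    std::thread worker([code, sendReply] { sendReply(code()); });
    worker.detach();
    return "<ack/>";
}

int main() {
    auto code = [] { return Reply("<reply>configuration done</reply>"); };

    std::cout << "sync reply: " << runSynchronous(code) << "\n";

    std::cout << "async ack:  "
              << runAsynchronous(code, [](const Reply& r) {
                     std::cout << "async reply: " << r << "\n";
                 })
              << "\n";

    // Wait briefly so the detached thread can report back before main() exits.
    // A real controller would instead rely on a time-out or a periodic "ping",
    // since it is not informed automatically if the cell dies.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
```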
The default TS GUI is implemented with the Ajaxell widget library. This is an out-of-the-box solution which does not require any additional development by the sub-systems. Figure 4-16 shows the TS GUI. It provides several controls: i) to execute cell commands; ii) to initialize, operate, and kill cell operations; iii) to visualize monitoring information retrieved from a monitor collector; iv) to access the logging record for audit trails and post-mortem analysis; v) to populate the L1 trigger configuration database; vi) to request support; and vii) to download documentation.

The cell web interface fulfills the requirement of automating the generation of a graphical user interface (Section 4.2.2). The default TS GUI can be extended with "control panels". A control panel is a sub-system specific graphical setup, normally intended for expert operations of the sub-system hardware. The control panel infrastructure allows expert tools to be developed with the TS framework. This possibility opens the door for the migration of existing standalone tools (Section 1.4.4) to control panels, and therefore contributes to the harmonization of the underlying technologies for both the expert tools and the TS. This homogeneous technological approach has the following benefits: i) smoothing the learning curve of the operators; ii) simplifying the overall L1 trigger OSWI maintenance; and iii) enhancing the sharing of code and experience.

The implementation of a sub-system control panel is equivalent to developing a SubsystemPanel class which inherits from the CellPanel class (Figure 4-6). This development consists of defining the SubsystemPanel::layout() method following the guidelines of the TS framework user's guide and using the widgets of the Ajaxell library [90]. The example of the Global Trigger control panel is presented in Section 6.5.1.

Prepare cell context: The cell context, presented in Section 4.4.4.2, is a shared object among all CellObject objects that form a given cell. The CellAbstractContext object contains the Logger, the xhannels and the factories. The cell context can be extended in order to store sub-system specific shared objects like a hardware driver. To extend the cell context it is necessary to define a class descendant of CellAbstractContext (e.g. SubsystemContext in Figure 4-6). The cell context object has to be created in the cell constructor and assigned to the context_ attribute. The cell context attribute can be accessed from any CellObject object, for instance a cell command or operation. A minimal sketch of such an extension is given after these preparation steps.

Prepare xhannel list file: The preparation of the xhannel list consists of defining the external web service providers that will be used by the cell: other cells, the TStore application to access the configuration database, or any other XDAQ application (Section 4.4.4.9). Once the cell is running, the xhannels are accessible through the cell context object.
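As referenced above, here is a minimal, self-contained sketch of the "Prepare cell context" step, using simplified stand-ins for the framework base classes. Only the names CellAbstractContext, SubsystemContext and the context_ attribute, and the rule that the context is created in the cell constructor, come from the text; the hardware driver and its register call are hypothetical.

```cpp
#include <iostream>
#include <memory>
#include <string>

// Hypothetical sub-system hardware driver shared through the cell context.
struct HardwareDriver {
    void writeRegister(const std::string& name, int value) {
        std::cout << name << " <- " << value << "\n";   // pretend hardware access
    }
};

// Simplified stand-in for the framework context base class.
// (The real context also holds the Logger, the xhannels and the factories.)
struct CellAbstractContext {
    virtual ~CellAbstractContext() = default;
};

// Sub-system extension of the context, holding the shared driver.
struct SubsystemContext : CellAbstractContext {
    HardwareDriver driver;
};

// Simplified stand-in for the framework cell base class.
class CellAbstract {
protected:
    std::shared_ptr<CellAbstractContext> context_;   // shared by all CellObject instances
};

class SubsystemCell : public CellAbstract {
public:
    SubsystemCell() {
        // Create the extended context in the cell constructor and assign it
        // to the context_ attribute, as the framework guidelines require.
        context_ = std::make_shared<SubsystemContext>();
    }

    // Stands in for the body of a cell command or operation, which reaches the
    // hardware driver through the shared context.
    void exampleCommandBody() {
        auto ctx = std::static_pointer_cast<SubsystemContext>(context_);
        ctx->driver.writeRegister("EXAMPLE_REGISTER", 42);
    }
};

int main() {
    SubsystemCell cell;
    cell.exampleCommandBody();
}
```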
id cern-1446282
institution European Organization for Nuclear Research
language eng
publishDate 2008
record_format invenio
spelling cern-1446282 2019-09-30T06:29:59Z http://cds.cern.ch/record/1446282 eng Ildefons Magrans de Abril The CMS Trigger Supervisor: Control and Hardware Monitoring System of the CMS Level-1 Trigger at CERN Detectors and Experimental Techniques Computing and Computers CERN-THESIS-2008-169 CMS-TS-2008-016 oai:cds.cern.ch:1446282 2008
spellingShingle Detectors and Experimental Techniques
Computing and Computers
Ildefons Magrans de Abril
The CMS Trigger Supervisor: Control and Hardware Monitoring System of the CMS Level-1 Trigger at CERN
title The CMS Trigger Supervisor: Control and Hardware Monitoring System of the CMS Level-1 Trigger at CERN
title_full The CMS Trigger Supervisor: Control and Hardware Monitoring System of the CMS Level-1 Trigger at CERN
title_fullStr The CMS Trigger Supervisor: Control and Hardware Monitoring System of the CMS Level-1 Trigger at CERN
title_full_unstemmed The CMS Trigger Supervisor: Control and Hardware Monitoring System of the CMS Level-1 Trigger at CERN
title_short The CMS Trigger Supervisor: Control and Hardware Monitoring System of the CMS Level-1 Trigger at CERN
title_sort cms trigger supervisor: control and hardware monitoring system of the cms level-1 trigger at cern
topic Detectors and Experimental Techniques
Computing and Computers
url http://cds.cern.ch/record/1446282
work_keys_str_mv AT ildefonsmagransdeabril thecmstriggersupervisorcontrolandhardwaremonitoringsystemofthecmslevel1triggeratcern
AT ildefonsmagransdeabril cmstriggersupervisorcontrolandhardwaremonitoringsystemofthecmslevel1triggeratcern