Grid site availability evaluation and monitoring at CMS
The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) uses distributed grid computing to store, process, and analyse the vast quantity of scientific data recorded every year. The computing resources are grouped into sites and organized in a tiered structure. Each site provide...
Main authors: | Lyons, Gaston; Maciulaitis, Rokas; Bagliesi, Giuseppe; Lammel, Stephan; Sciabà, Andrea |
---|---|
Language: | eng |
Published: | 2017 |
Subjects: | Computing and Computers |
Online access: | https://dx.doi.org/10.1088/1742-6596/898/9/092014 http://cds.cern.ch/record/2296664 |
_version_ | 1780956907943493632 |
---|---|
author | Lyons, Gaston Maciulaitis, Rokas Bagliesi, Giuseppe Lammel, Stephan Sciabà, Andrea |
author_facet | Lyons, Gaston Maciulaitis, Rokas Bagliesi, Giuseppe Lammel, Stephan Sciabà, Andrea |
author_sort | Lyons, Gaston |
collection | CERN |
description | The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) uses distributed grid computing to store, process, and analyse the vast quantity of scientific data recorded every year. The computing resources are grouped into sites and organized in a tiered structure. Each site provides computing and storage to the CMS computing grid. Over a hundred sites worldwide contribute resources ranging from a hundred to well over ten thousand computing cores and storage from tens of TBytes to tens of PBytes. In such a large computing setup, scheduled and unscheduled outages occur continually and must not be allowed to significantly impact data handling, processing, and analysis. Unscheduled capacity and performance reductions need to be detected promptly and corrected. CMS developed a sophisticated site evaluation and monitoring system for Run 1 of the LHC based on tools of the Worldwide LHC Computing Grid. For Run 2 of the LHC, the site evaluation and monitoring system is being overhauled to enable faster detection of and reaction to failures and more dynamic handling of computing resources. Enhancements are planned to better distinguish site issues from central service issues and to make evaluations more transparent and informative to site support staff. |
id | oai-inspirehep.net-1638611 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2017 |
record_format | invenio |
spelling | oai-inspirehep.net-1638611 2021-02-09T10:06:08Z doi:10.1088/1742-6596/898/9/092014 http://cds.cern.ch/record/2296664 eng Lyons, Gaston Maciulaitis, Rokas Bagliesi, Giuseppe Lammel, Stephan Sciabà, Andrea Grid site availability evaluation and monitoring at CMS Computing and Computers The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) uses distributed grid computing to store, process, and analyse the vast quantity of scientific data recorded every year. The computing resources are grouped into sites and organized in a tiered structure. Each site provides computing and storage to the CMS computing grid. Over a hundred sites worldwide contribute resources ranging from a hundred to well over ten thousand computing cores and storage from tens of TBytes to tens of PBytes. In such a large computing setup, scheduled and unscheduled outages occur continually and must not be allowed to significantly impact data handling, processing, and analysis. Unscheduled capacity and performance reductions need to be detected promptly and corrected. CMS developed a sophisticated site evaluation and monitoring system for Run 1 of the LHC based on tools of the Worldwide LHC Computing Grid. For Run 2 of the LHC, the site evaluation and monitoring system is being overhauled to enable faster detection of and reaction to failures and more dynamic handling of computing resources. Enhancements are planned to better distinguish site issues from central service issues and to make evaluations more transparent and informative to site support staff. oai:inspirehep.net:1638611 2017 |
spellingShingle | Computing and Computers Lyons, Gaston Maciulaitis, Rokas Bagliesi, Giuseppe Lammel, Stephan Sciabà, Andrea Grid site availability evaluation and monitoring at CMS |
title | Grid site availability evaluation and monitoring at CMS |
title_full | Grid site availability evaluation and monitoring at CMS |
title_fullStr | Grid site availability evaluation and monitoring at CMS |
title_full_unstemmed | Grid site availability evaluation and monitoring at CMS |
title_short | Grid site availability evaluation and monitoring at CMS |
title_sort | grid site availability evaluation and monitoring at cms |
topic | Computing and Computers |
url | https://dx.doi.org/10.1088/1742-6596/898/9/092014 http://cds.cern.ch/record/2296664 |
work_keys_str_mv | AT lyonsgaston gridsiteavailabilityevaluationandmonitoringatcms AT maciulaitisrokas gridsiteavailabilityevaluationandmonitoringatcms AT bagliesigiuseppe gridsiteavailabilityevaluationandmonitoringatcms AT lammelstephan gridsiteavailabilityevaluationandmonitoringatcms AT sciabaandrea gridsiteavailabilityevaluationandmonitoringatcms |