
A New Visual Analytics Toolkit for ATLAS Computing Metadata

The ATLAS experiment at the Large Hadron Collider has a complex heterogeneous distributed computing infrastructure, which is used to process and analyse exabytes of data. Metadata are collected and stored at all stages of data processing and physics analysis. All metadata could be divided into opera...


Bibliographic Details
Main authors: Grigorieva, M A, Alekseev, A A, Galkin, T P, Klimentov, A A, Korchuganova, T A, Milman, I E, Padolski, S V, Pilyugin, V V, Titov, M A
Language: eng
Published: 2019
Subjects:
Online access: https://dx.doi.org/10.1088/1742-6596/1525/1/012086
http://cds.cern.ch/record/2676780
_version_ 1780962744582799360
author Grigorieva, M A
Alekseev, A A
Galkin, T P
Klimentov, A A
Korchuganova, T A
Milman, I E
Padolski, S V
Pilyugin, V V
Titov, M A
author_facet Grigorieva, M A
Alekseev, A A
Galkin, T P
Klimentov, A A
Korchuganova, T A
Milman, I E
Padolski, S V
Pilyugin, V V
Titov, M A
author_sort Grigorieva, M A
collection CERN
description The ATLAS experiment at the Large Hadron Collider has a complex heterogeneous distributed computing infrastructure, which is used to process and analyse exabytes of data. Metadata are collected and stored at all stages of data processing and physics analysis. All metadata can be divided into operational metadata, used for quasi-online monitoring, and archival metadata, used to study the behaviour of the corresponding systems over a given period of time (i.e., long-term data analysis). Ensuring the stability and efficiency of complex, large-scale systems, such as those in ATLAS Computing, requires sophisticated monitoring tools, and long-term analysis of monitoring data becomes as important as the monitoring itself. Archival metadata, which contain many metrics (hardware and software environment descriptions, network states, application parameters, errors) accumulated for more than a decade, can be successfully processed by various machine learning (ML) algorithms for classification, clustering and dimensionality reduction. However, ML data analysis, despite its massive use, is not without shortcomings: the underlying algorithms are usually treated as “black boxes”, as there are no effective techniques for understanding their internal mechanisms, and the involvement of domain experts in the process of ML data analysis is very limited. As a result, the data analysis suffers from a lack of human supervision. Moreover, conclusions reached by algorithms with high accuracy sometimes make no sense with regard to the real data model. In this work we will demonstrate how interactive data visualization can be applied to extend routine ML data analysis methods. Visualization allows active use of human spatial thinking to identify new tendencies and patterns in the collected data, avoiding the need to struggle with instrumental analytics tools.
The architecture and the corresponding prototype of Interactive Visual Explorer (InVEx), a visual analytics toolkit for the multidimensional data analysis of ATLAS computing metadata, will be presented. The web-application part of the prototype provides interactive visual clustering of ATLAS computing jobs and a search for non-trivial behaviour of computing jobs and its possible causes.
id cern-2676780
institution European Organization for Nuclear Research
language eng
publishDate 2019
record_format invenio
spelling cern-2676780 2021-10-11T20:30:08Z doi:10.1088/1742-6596/1525/1/012086 http://cds.cern.ch/record/2676780 eng Grigorieva, M A; Alekseev, A A; Galkin, T P; Klimentov, A A; Korchuganova, T A; Milman, I E; Padolski, S V; Pilyugin, V V; Titov, M A. A New Visual Analytics Toolkit for ATLAS Computing Metadata. Particle Physics - Experiment; Detectors and Experimental Techniques; Computing and Computers. ATL-SOFT-PROC-2019-005 oai:cds.cern.ch:2676780 2019-05-29
title A New Visual Analytics Toolkit for ATLAS Computing Metadata
title_full A New Visual Analytics Toolkit for ATLAS Computing Metadata
title_fullStr A New Visual Analytics Toolkit for ATLAS Computing Metadata
title_full_unstemmed A New Visual Analytics Toolkit for ATLAS Computing Metadata
title_short A New Visual Analytics Toolkit for ATLAS Computing Metadata
title_sort new visual analytics toolkit for atlas computing metadata
topic Particle Physics - Experiment
Detectors and Experimental Techniques
Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/1525/1/012086
http://cds.cern.ch/record/2676780