A New Visual Analytics toolkit for ATLAS metadata

The ATLAS experiment at the LHC has a complex, heterogeneous, distributed computing infrastructure, which is used to process and analyse exabytes of data. Metadata are collected and stored at all stages of physics analysis and data processing. All metadata can be divided into operational metadata to...

Bibliographic Details
Main authors: Grigoryeva, Maria; Titov, Mikhail; Klimentov, Alexei; Korchuganova, Tatiana; Alekseev, Aleksandr; Galkin, Timofei; Piliugin, Viktor; Padolski, Siarhei
Language: eng
Published: 2019
Subjects: Particle Physics - Experiment
Online access: http://cds.cern.ch/record/2661474
_version_ 1780961526549577728
author Grigoryeva, Maria
Titov, Mikhail
Klimentov, Alexei
Korchuganova, Tatiana
Alekseev, Aleksandr
Galkin, Timofei
Piliugin, Viktor
Padolski, Siarhei
author_sort Grigoryeva, Maria
collection CERN
description The ATLAS experiment at the LHC has a complex, heterogeneous, distributed computing infrastructure, which is used to process and analyse exabytes of data. Metadata are collected and stored at all stages of physics analysis and data processing. These metadata can be divided into operational metadata, used for quasi-online monitoring, and archival metadata, used to study the systems’ behaviour over a given period of time (i.e., long-term data analysis). Ensuring the stability and efficiency of complex, large-scale systems such as those in ATLAS computing requires sophisticated monitoring tools, and long-term analysis of monitoring data becomes as important as the monitoring itself. Archival metadata contain many metrics (hardware and software environment descriptions, network state, application parameters, user account data, errors) accumulated over more than a decade, and can be successfully processed by various machine learning (ML) algorithms for classification, clustering and dimensionality reduction. However, ML data analysis, despite its widespread use, is not without shortcomings: the underlying algorithms are usually treated as “black boxes”, as there are no effective techniques for understanding their internal mechanisms, and the involvement of domain experts in the ML data analysis process is very limited. As a result, the data analysis suffers from a lack of human supervision. Moreover, conclusions that the algorithms reach with high accuracy may sometimes make no sense with respect to the real data model. In this work we will demonstrate how interactive data visualization can be applied to extend routine ML data analysis methods. Visualization makes it possible to use human spatial thinking actively to identify new tendencies and patterns in the collected data, without having to struggle with instrumental analytics tools.
The architecture and interface prototype of a visual analytics platform (VAP) for multidimensional analysis of ATLAS computing metadata will be presented. The general data processing and visualization methods of the VAP prototype will be implemented and tested on a slice of ATLAS jobs metadata. As a result, a web interface will provide interactive visual clustering of ATLAS jobs and a search for non-trivial behaviour and its possible causes. Furthermore, we will demonstrate a prototype of dynamic interactive visualization that makes it possible to observe how the clustering structure changes at different points in time.
id cern-2661474
institution European Organization for Nuclear Research
language eng
publishDate 2019
record_format invenio
spelling cern-2661474 2019-09-30T06:29:59Z http://cds.cern.ch/record/2661474 eng Grigoryeva, Maria; Titov, Mikhail; Klimentov, Alexei; Korchuganova, Tatiana; Alekseev, Aleksandr; Galkin, Timofei; Piliugin, Viktor; Padolski, Siarhei. A New Visual Analytics toolkit for ATLAS metadata. Particle Physics - Experiment. ATL-SOFT-SLIDE-2019-052 oai:cds.cern.ch:2661474 2019-02-21
title A New Visual Analytics toolkit for ATLAS metadata
title_sort new visual analytics toolkit for atlas metadata
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2661474