Allocation Optimization for the ATLAS Rebalancing Data Service

The distributed data management system Rucio manages all data of the ATLAS collaboration across the grid. Automation, such as replication and rebalancing, is an important part of ensuring minimal workflow execution times. In this paper, a new rebalancing algorithm based on machine learning is propo...

Bibliographic Details
Main Authors: Vamosi, Ralf; Lassnig, Mario
Language: eng
Published: 2018
Subjects:
Online Access: http://cds.cern.ch/record/2628355
_version_ 1780959165375578112
author Vamosi, Ralf
Lassnig, Mario
author_facet Vamosi, Ralf
Lassnig, Mario
author_sort Vamosi, Ralf
collection CERN
description The distributed data management system Rucio manages all data of the ATLAS collaboration across the grid. Automation, such as replication and rebalancing, is an important part of ensuring minimal workflow execution times. In this paper, a new rebalancing algorithm based on machine learning is proposed. It can run independently of the existing rebalancing mechanism and can be modularised. It collects data from other services and learns optimality as it runs in the background. Periodically, this learning agent takes a subset of the global datasets and proposes them for redistribution to reduce waiting times. The user can interact and choose to accept, decline, or override the dataset placement suggestions. The accepted items are shifted continuously between destination data centres as a background service while taking network and storage utilisation into account.
id cern-2628355
institution European Organization for Nuclear Research
language eng
publishDate 2018
record_format invenio
spelling cern-2628355
2019-11-07T12:18:12Z
http://cds.cern.ch/record/2628355
eng
Vamosi, Ralf
Lassnig, Mario
Allocation Optimization for the ATLAS Rebalancing Data Service
Particle Physics - Experiment
The distributed data management system Rucio manages all data of the ATLAS collaboration across the grid. Automation, such as replication and rebalancing, is an important part of ensuring minimal workflow execution times. In this paper, a new rebalancing algorithm based on machine learning is proposed. It can run independently of the existing rebalancing mechanism and can be modularised. It collects data from other services and learns optimality as it runs in the background. Periodically, this learning agent takes a subset of the global datasets and proposes them for redistribution to reduce waiting times. The user can interact and choose to accept, decline, or override the dataset placement suggestions. The accepted items are shifted continuously between destination data centres as a background service while taking network and storage utilisation into account.
ATL-SOFT-SLIDE-2018-450
oai:cds.cern.ch:2628355
2018-07-03
spellingShingle Particle Physics - Experiment
Vamosi, Ralf
Lassnig, Mario
Allocation Optimization for the ATLAS Rebalancing Data Service
title Allocation Optimization for the ATLAS Rebalancing Data Service
title_full Allocation Optimization for the ATLAS Rebalancing Data Service
title_fullStr Allocation Optimization for the ATLAS Rebalancing Data Service
title_full_unstemmed Allocation Optimization for the ATLAS Rebalancing Data Service
title_short Allocation Optimization for the ATLAS Rebalancing Data Service
title_sort allocation optimization for the atlas rebalancing data service
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2628355
work_keys_str_mv AT vamosiralf allocationoptimizationfortheatlasrebalancingdataservice
AT lassnigmario allocationoptimizationfortheatlasrebalancingdataservice