
Using DODAS as deployment manager for smart caching of CMS data management system

DODAS stands for Dynamic On Demand Analysis Service and is a Platform as a Service toolkit built around several EOSC-hub services designed to instantiate and configure on-demand container-based clusters over public or private Cloud resources. It automates the whole workflow from service provisioning...


Bibliographic Details
Main author: Tracolli, Mirco
Language: eng
Published: 2019
Subjects: Detectors and Experimental Techniques
Online access: https://dx.doi.org/10.1088/1742-6596/1525/1/012057
http://cds.cern.ch/record/2676848
author Tracolli, Mirco
collection CERN
description DODAS (Dynamic On Demand Analysis Service) is a Platform as a Service toolkit built around several EOSC-hub services and designed to instantiate and configure on-demand container-based clusters over public or private Cloud resources. It automates the whole workflow, from service provisioning to the configuration and setup of software applications, and therefore allows using "any cloud provider" with almost zero effort. In this paper, we demonstrate how DODAS can be adopted as a deployment manager to set up and manage the compute resources and services required to develop an AI solution for smart data caching. Compared with the regular centrally managed storage of the current CMS computing model, a smart caching layer may reduce operational cost and increase flexibility. The cache space should be dynamically populated with the most requested data. In addition, clustering such caching systems will allow them to be operated as a Content Delivery System between data providers and end-users. Moreover, a geographically distributed caching layer will also serve a data-lake-based model, in which many satellite computing centers might appear and disappear dynamically. In this context, our strategy is to develop a flexible and automated AI environment for smart management of the content of such a clustered cache system. In this contribution, we describe the computational phases identified for implementing the AI environment, as well as the related DODAS integration. We start with an overview of the architecture of the pre-processing step, based on Spark, whose role is to prepare data for a Machine Learning technique, with a focus on the automation implemented through DODAS. We then show how to train an AI-based smart cache and how we implemented a training facility managed through DODAS. Finally, we provide an overview of the inference system, based on CMS-TensorFlow as a Service and also deployed as a DODAS service.
id cern-2676848
institution European Organization for Nuclear Research
language eng
publishDate 2019
record_format invenio
spelling cern-2676848; 2022-11-17T14:32:33Z; doi:10.1088/1742-6596/1525/1/012057; http://cds.cern.ch/record/2676848; eng; Tracolli, Mirco; Using DODAS as deployment manager for smart caching of CMS data management system; Detectors and Experimental Techniques; CMS-CR-2019-066; oai:cds.cern.ch:2676848; 2019-05-16
title Using DODAS as deployment manager for smart caching of CMS data management system
topic Detectors and Experimental Techniques