
A comparison of different database technologies for the CMS AsyncStageOut transfer database

AsyncStageOut (ASO) is the component of the CMS distributed data analysis system (CRAB) that manages users' transfers in a centrally controlled way using the File Transfer System (FTS3) at CERN. It addresses a major weakness of the previous, decentralized model, namely that the transfer of the user’s...


Bibliographic Details
Main authors: Ciangottini, D, Balcas, J, Mascheroni, M, Rupeika, E A, Vaandering, E, Riahi, H, Silva, J M D, Hernandez, J M, Belforte, S, Ivanov, T T
Language: eng
Published: 2017
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/898/4/042048
http://cds.cern.ch/record/2297286
_version_ 1780956899764600832
author Ciangottini, D
Balcas, J
Mascheroni, M
Rupeika, E A
Vaandering, E
Riahi, H
Silva, J M D
Hernandez, J M
Belforte, S
Ivanov, T T
author_sort Ciangottini, D
collection CERN
description AsyncStageOut (ASO) is the component of the CMS distributed data analysis system (CRAB) that manages users' transfers in a centrally controlled way using the File Transfer System (FTS3) at CERN. It addresses a major weakness of the previous, decentralized model, namely that the transfer of the user’s output data to a single remote site was part of the job execution, resulting in inefficient use of job slots and an unacceptable failure rate. Currently ASO manages up to 600k files of various sizes per day from more than 500 users per month, spread over more than 100 sites. ASO uses a NoSQL database (CouchDB) for internal bookkeeping and as a way to communicate with other CRAB components. Since ASO/CRAB were put in production in 2014, the number of transfers has constantly increased, up to a point where the pressure on the central CouchDB instance became critical, creating new challenges for the system's scalability, performance, and monitoring. This forced a re-engineering of the ASO application to increase its scalability and lower its operational effort. In this contribution we present a comparison of the performance of the current NoSQL implementation and a new SQL implementation, and how their different strengths and features influenced the design choices and operational experience. We also discuss other architectural changes introduced in the system to handle the increasing load and latency in delivering output to the user.
id oai-inspirehep.net-1638458
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2017
record_format invenio
spelling oai-inspirehep.net-1638458 2021-02-09T10:07:03Z doi:10.1088/1742-6596/898/4/042048 http://cds.cern.ch/record/2297286 eng oai:inspirehep.net:1638458 2017
title A comparison of different database technologies for the CMS AsyncStageOut transfer database
topic Computing and Computers