Accessing data federations with CVMFS

Data federations have become an increasingly common tool for large collaborations such as CMS and ATLAS to efficiently distribute large data files. Unfortunately, these are typically implemented with weak namespace semantics and a non-POSIX API. On the other hand, CVMFS has provided a POSIX-complian...

Bibliographic Details
Main authors: Weitzel, Derek, Bockelman, Brian, Dykstra, Dave, Blomer, Jakob, Meusel, René
Language: eng
Published: 2017
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/898/6/062044
http://cds.cern.ch/record/2298614
author Weitzel, Derek
Bockelman, Brian
Dykstra, Dave
Blomer, Jakob
Meusel, René
collection CERN
description Data federations have become an increasingly common tool for large collaborations such as CMS and ATLAS to efficiently distribute large data files. Unfortunately, these are typically implemented with weak namespace semantics and a non-POSIX API. On the other hand, CVMFS has provided a POSIX-compliant read-only interface for use cases with a small working set size (such as software distribution). The metadata required for the CVMFS POSIX interface is distributed through a caching hierarchy, allowing it to scale to about a hundred thousand hosts. In this paper, we describe our contributions to CVMFS that merge the data scalability of XRootD-based data federations (such as AAA) with the metadata scalability and POSIX interface of CVMFS. We modified CVMFS so it can serve unmodified files without copying them to the repository server. CVMFS 2.2.0 is also able to redirect requests for data files to servers outside of the CVMFS content distribution network. Finally, we added the ability to manage authorization and authentication using security credentials such as X.509 proxy certificates. We combined these modifications with the OSG's StashCache regional XRootD caching infrastructure to create a cached data distribution network. We show performance metrics for accessing the data federation through CVMFS compared to direct data federation access. Additionally, we discuss the improved user experience of providing access to a data federation through a POSIX filesystem.
id oai-inspirehep.net-1630215
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2017
record_format invenio
spelling oai-inspirehep.net-1630215 | 2021-02-09T10:05:59Z | doi:10.1088/1742-6596/898/6/062044 | http://cds.cern.ch/record/2298614 | eng | Weitzel, Derek; Bockelman, Brian; Dykstra, Dave; Blomer, Jakob; Meusel, René | Accessing data federations with CVMFS | Computing and Computers | FERMILAB-CONF-17-407-CD | oai:inspirehep.net:1630215 | 2017
title Accessing data federations with CVMFS
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/898/6/062044
http://cds.cern.ch/record/2298614