Cargando…

Evaluation of software based redundancy algorithms for the EOS storage system at CERN

EOS is a new disk based storage system used in production at CERN since autumn 2011. It is implemented using the plug-in architecture of the XRootD software framework and allows remote file access via XRootD protocol or POSIX-like file access via FUSE mounting. EOS was designed to fulfill specific r...

Descripción completa

Detalles Bibliográficos
Autores principales: Peters, Andreas-Joachim, Sindrilaru, Elvin Alin, Zigann, Philipp
Lenguaje:eng
Publicado: 2012
Materias:
Acceso en línea:https://dx.doi.org/10.1088/1742-6596/396/4/042046
http://cds.cern.ch/record/1565909
Descripción
Sumario:EOS is a new disk based storage system used in production at CERN since autumn 2011. It is implemented using the plug-in architecture of the XRootD software framework and allows remote file access via XRootD protocol or POSIX-like file access via FUSE mounting. EOS was designed to fulfill specific requirements of disk storage scalability and IO scheduling performance for LHC analysis use cases. This is achieved by following a strategy of decoupling disk and tape storage as individual storage systems. A key point of the EOS design is to provide high availability and redundancy of files via a software implementation which uses disk-only storage systems without hardware RAID arrays. All this is aimed at reducing the overall cost of the system and also simplifying the operational procedures. This paper presents the advantages and disadvantages of redundancy by hardware (most classical storage installations) in comparison to redundancy by software. The latter is implemented in the EOS system and achieves its goal by spawning data and parity stripes via remote file access over nodes. The gain in redundancy and reliability comes with a trade-off in the following areas: Increased complexity of the network connectivity CPU intensive parity computations during file creation and recovery Performance loss through remote disk coupling An evaluation and performance figures of several redundancy algorithms are presented for dual parity RAID and Reed-Solomon codecs. Moreover, the characteristics and applicability of these algorithms are discussed in the context of reliable data storage systems.