ECFS: A decentralized, distributed and fault-tolerant FUSE filesystem for the LHCb online farm

The LHCb experiment records millions of proton collisions every second, but only a fraction of them are useful for LHCb physics. To filter out the 'bad events', a large farm of x86 servers (~2000 nodes) has been put in place. These servers boot and run from NFS; however, they u...

Bibliographic Details
Main authors: Rybczynski, Tomasz, Bonaccorsi, Enrico, Neufeld, Niko
Published: 2014
Subjects:
Online access: https://dx.doi.org/10.1088/1742-6596/513/4/042038
http://cds.cern.ch/record/2055710
_version_ 1780948310708715520
author Rybczynski, Tomasz
Bonaccorsi, Enrico
Neufeld, Niko
author_facet Rybczynski, Tomasz
Bonaccorsi, Enrico
Neufeld, Niko
author_sort Rybczynski, Tomasz
collection CERN
description The LHCb experiment records millions of proton collisions every second, but only a fraction of them are useful for LHCb physics. To filter out the 'bad events', a large farm of x86 servers (~2000 nodes) has been put in place. These servers boot and run from NFS, but they use their local disks to temporarily store data that cannot be processed in real time ('data-deferring'). These events are processed later, when no live data is coming in, which greatly increases the effective CPU power. This gain in CPU power depends critically on the availability of the local disks. For cost and power reasons, mirroring (RAID-1) is not used, which leads to considerable operational headaches: failing disks, disk errors, and server failures induced by faulty disks. To mitigate these problems and increase the reliability of the LHCb farm while keeping cost and power consumption low, existing highly available and distributed file systems were studied extensively. While many distributed file systems provide reliability through 'file replication', none of those evaluated supports erasure-coding algorithms. A decentralised, distributed and fault-tolerant 'write once read many' file system has therefore been designed and implemented as a proof of concept. Its main goals are to provide fault tolerance without file replication techniques, which are expensive in terms of disk space, and to provide a unique namespace. This paper describes the design and implementation of the Erasure Codes File System (ECFS) and presents the specialised FUSE interface for Linux. Depending on the encoding algorithm, ECFS uses a certain number of target directories as a backend to store the segments that compose the encoded data. When the target directories are mounted via NFS/autofs, ECFS acts as a file system over a network/block-level RAID across multiple servers.
id cern-2055710
institution European Organization for Nuclear Research (CERN)
publishDate 2014
record_format invenio
spelling cern-2055710 2022-08-17T13:32:46Z doi:10.1088/1742-6596/513/4/042038 http://cds.cern.ch/record/2055710 oai:cds.cern.ch:2055710 2014
spellingShingle Computing and Computers
Rybczynski, Tomasz
Bonaccorsi, Enrico
Neufeld, Niko
ECFS: A decentralized, distributed and fault-tolerant FUSE filesystem for the LHCb online farm
title ECFS: A decentralized, distributed and fault-tolerant FUSE filesystem for the LHCb online farm
title_full ECFS: A decentralized, distributed and fault-tolerant FUSE filesystem for the LHCb online farm
title_fullStr ECFS: A decentralized, distributed and fault-tolerant FUSE filesystem for the LHCb online farm
title_full_unstemmed ECFS: A decentralized, distributed and fault-tolerant FUSE filesystem for the LHCb online farm
title_short ECFS: A decentralized, distributed and fault-tolerant FUSE filesystem for the LHCb online farm
title_sort ecfs: a decentralized, distributed and fault-tolerant fuse filesystem for the lhcb online farm
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/513/4/042038
http://cds.cern.ch/record/2055710
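The description field above says that ECFS stores the segments of an encoded file in a set of target directories and tolerates failures without full file replication. The record does not say which encoding algorithms ECFS implements, so the following is only a minimal sketch of the general idea, assuming the simplest possible erasure code (a single XOR parity segment, as in RAID-4/5). The function names, the `.segN` file naming, and the layout of one segment per target directory are hypothetical and are not taken from ECFS.

```python
import os
import tempfile
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, k):
    """Split data into k equal-size segments (zero-padded) plus one XOR parity segment."""
    seg_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(seg_len * k, b"\0")
    segments = [padded[i * seg_len:(i + 1) * seg_len] for i in range(k)]
    parity = reduce(xor_bytes, segments)
    return segments + [parity]


def write_segments(segments, target_dirs, name):
    """Write segment i into target directory i (hypothetical naming scheme)."""
    for i, (seg, d) in enumerate(zip(segments, target_dirs)):
        with open(os.path.join(d, f"{name}.seg{i}"), "wb") as f:
            f.write(seg)


def read_file(target_dirs, name, k, size):
    """Read the file back, rebuilding at most one missing segment from the others."""
    segments, missing = [], []
    for i, d in enumerate(target_dirs):
        try:
            with open(os.path.join(d, f"{name}.seg{i}"), "rb") as f:
                segments.append(f.read())
        except OSError:
            segments.append(None)
            missing.append(i)
    if len(missing) > 1:
        raise IOError("single parity cannot recover more than one lost segment")
    if missing:
        survivors = [s for s in segments if s is not None]
        # XOR of all surviving segments (data and parity) equals the lost one.
        segments[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(segments[:k])[:size]


if __name__ == "__main__":
    # Four local temporary directories stand in for NFS-mounted target directories.
    dirs = [tempfile.mkdtemp() for _ in range(4)]
    payload = b"deferred event data " * 100
    write_segments(encode(payload, k=3), dirs, "run0001")
    os.remove(os.path.join(dirs[1], "run0001.seg1"))  # simulate a failed disk
    assert read_file(dirs, "run0001", k=3, size=len(payload)) == payload
```

With k = 3 data segments and one parity segment, the stored volume is 4/3 of the original file size, compared with 2x for mirroring. Surviving more than one simultaneous failure would need a stronger code such as Reed-Solomon, which this sketch does not attempt.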