Code health in EOS: Improving test infrastructure and overall service quality

During the last few years, the EOS distributed storage system at CERN has seen a steady increase in use, both in terms of traffic volume as well as sheer amount of stored data. This has brought the unwelcome side effect of stretching the EOS software stack to its design constraints, resulting in frequ...

Bibliographic Details
Main authors: Sindrilaru, Elvin Alin; Bitzes, Georgios; Luchetti, Fabio; Patrascoiu, Mihai
Language: eng
Published: 2020
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1051/epjconf/202024505027
http://cds.cern.ch/record/2752840
author Sindrilaru, Elvin Alin
Bitzes, Georgios
Luchetti, Fabio
Patrascoiu, Mihai
collection CERN
description During the last few years, the EOS distributed storage system at CERN has seen a steady increase in use, both in terms of traffic volume as well as sheer amount of stored data. This has brought the unwelcome side effect of stretching the EOS software stack to its design constraints, resulting in frequent user-facing issues and occasional downtime of critical services. In this paper, we discuss the challenges of adapting the software to meet the increasing demands, while at the same time preserving functionality without breaking existing features or introducing new bugs. We document our efforts in modernizing and stabilizing the codebase, through the refactoring of legacy code, introduction of widespread unit testing, as well as leveraging Kubernetes to build a comprehensive test orchestration framework capable of stressing every aspect of an EOS installation, with the goal of discovering bottlenecks and instabilities before they reach production.
id oai-inspirehep.net-1832166
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2020
record_format invenio
spelling oai-inspirehep.net-1832166 2021-03-01T20:16:22Z doi:10.1051/epjconf/202024505027 http://cds.cern.ch/record/2752840 oai:inspirehep.net:1832166
title Code health in EOS: Improving test infrastructure and overall service quality
topic Computing and Computers
url https://dx.doi.org/10.1051/epjconf/202024505027
http://cds.cern.ch/record/2752840