Cargando…

Fast distributed compilation and testing of large C++ projects

High energy physics experiments traditionally have large software codebases primarily written in C++ and the LHCb physics software stack is no exception. Compiling from scratch can easily take 5 hours or more for the full stack even on an 8-core VM. In a development workflow, incremental builds ofte...

Descripción completa

Detalles Bibliográficos
Autor principal: Matev, Rosen
Lenguaje:eng
Publicado: 2020
Materias:
Acceso en línea:https://dx.doi.org/10.1051/epjconf/202024505001
http://cds.cern.ch/record/2757342
_version_ 1780969976419581952
author Matev, Rosen
author_facet Matev, Rosen
author_sort Matev, Rosen
collection CERN
description High energy physics experiments traditionally have large software codebases primarily written in C++ and the LHCb physics software stack is no exception. Compiling from scratch can easily take 5 hours or more for the full stack even on an 8-core VM. In a development workflow, incremental builds often do not significantly speed up compilation because even just a change of the modification time of a widely used header leads to many compiler and linker invokations. Using powerful shared servers is not practical as users have no control and maintenance is an issue. Even though support for building partial checkouts on top of published project versions exists, by far the most practical development workflow involves full project checkouts because of off-the-shelf tool support (git, intellisense, etc.)This paper details a deployment of distcc, a distributed compilation server, on opportunistic resources such as development machines. The best performance operation mode is achieved when preprocessing remotely and profiting from the shared CernVM File System. A 10 (30) fold speedup of elapsed (real) time is achieved when compiling Gaudi, the base of the LHCb stack, when comparing local compilation on a 4 core VM to remote compilation on 80 cores, where the bottleneck becomes non-distributed work such as linking. Compilation results are cached locally using ccache, allowing for even faster rebuilding. A recent distributed memcached-based shared cache is tested as well as a more modern distributed system by Mozilla, sccache, backed by S3 storage. These allow for global sharing of compilation work, which can speed up both central CI builds and local development builds. Finally, we explore remote caching and execution services based on Bazel, and how they apply to Gaudi-based software for distributing not only compilation but also linking and even testing.
id oai-inspirehep.net-1831589
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2020
record_format invenio
spelling oai-inspirehep.net-18315892021-03-22T22:08:56Zdoi:10.1051/epjconf/202024505001http://cds.cern.ch/record/2757342engMatev, RosenFast distributed compilation and testing of large C++ projectsComputing and ComputersHigh energy physics experiments traditionally have large software codebases primarily written in C++ and the LHCb physics software stack is no exception. Compiling from scratch can easily take 5 hours or more for the full stack even on an 8-core VM. In a development workflow, incremental builds often do not significantly speed up compilation because even just a change of the modification time of a widely used header leads to many compiler and linker invokations. Using powerful shared servers is not practical as users have no control and maintenance is an issue. Even though support for building partial checkouts on top of published project versions exists, by far the most practical development workflow involves full project checkouts because of off-the-shelf tool support (git, intellisense, etc.)This paper details a deployment of distcc, a distributed compilation server, on opportunistic resources such as development machines. The best performance operation mode is achieved when preprocessing remotely and profiting from the shared CernVM File System. A 10 (30) fold speedup of elapsed (real) time is achieved when compiling Gaudi, the base of the LHCb stack, when comparing local compilation on a 4 core VM to remote compilation on 80 cores, where the bottleneck becomes non-distributed work such as linking. Compilation results are cached locally using ccache, allowing for even faster rebuilding. A recent distributed memcached-based shared cache is tested as well as a more modern distributed system by Mozilla, sccache, backed by S3 storage. These allow for global sharing of compilation work, which can speed up both central CI builds and local development builds. Finally, we explore remote caching and execution services based on Bazel, and how they apply to Gaudi-based software for distributing not only compilation but also linking and even testing.oai:inspirehep.net:18315892020
spellingShingle Computing and Computers
Matev, Rosen
Fast distributed compilation and testing of large C++ projects
title Fast distributed compilation and testing of large C++ projects
title_full Fast distributed compilation and testing of large C++ projects
title_fullStr Fast distributed compilation and testing of large C++ projects
title_full_unstemmed Fast distributed compilation and testing of large C++ projects
title_short Fast distributed compilation and testing of large C++ projects
title_sort fast distributed compilation and testing of large c++ projects
topic Computing and Computers
url https://dx.doi.org/10.1051/epjconf/202024505001
http://cds.cern.ch/record/2757342
work_keys_str_mv AT matevrosen fastdistributedcompilationandtestingoflargecprojects