Multicore in Production: Advantages and Limits of the Multiprocess Approach

The shared memory architecture of multicore CPUs provides HENP developers with the opportunity to reduce the memory footprint of their applications by sharing memory pages between the cores in a processor. ATLAS pioneered the multi-process approach to parallelizing HENP applications. Using Linux for...

Bibliographic Details
Main authors: Binet, S; Calafiura, P; Lavrijsen, W; Leggett, Ch; Lesny, D; Jha, M K; Severini, H; Smith, D; Snyder, S; Tatarkhanov, M; Tsulaia, V; van Gemmeren, P; Washbrook, A
Language: eng
Published: 2011
Subjects: Detectors and Experimental Techniques
Online access: http://cds.cern.ch/record/1379474
_version_ 1780923037439229952
author Binet, S
Calafiura, P
Lavrijsen, W
Leggett, Ch
Lesny, D
Jha, M K
Severini, H
Smith, D
Snyder, S
Tatarkhanov, M
Tsulaia, V
van Gemmeren, P
Washbrook, A
author_sort Binet, S
collection CERN
description The shared memory architecture of multicore CPUs provides HENP developers with the opportunity to reduce the memory footprint of their applications by sharing memory pages between the cores in a processor. ATLAS pioneered the multi-process approach to parallelizing HENP applications. Using Linux fork() and the Copy On Write mechanism, we implemented a simple event task farm which allows sharing up to 50% of memory pages among event worker processes with negligible CPU overhead. By leaving the task of managing shared memory pages to the operating system, we have been able to run in parallel large reconstruction and simulation applications, originally written to run in a single thread of execution, with little to no change to the application code. In spite of this, the process of validating athena multi-process for production took ten months of concentrated effort and is expected to continue for several more months. In general terms, we had two classes of problems in the multi-process port: merging the output files produced by the event workers, and ensuring the reproducibility of the results, especially of Monte Carlo simulations, when running with different configurations, in particular with different numbers of event workers. Besides validating the software itself, an important and time-consuming aspect of running multicore applications in production is configuring the production system to handle multicore jobs. This entails defining multicore batch queues, where the unit resource is not a core but a whole computing node; monitoring the output of many event workers; and adapting the job definition layer to handle computing resources with very different event throughputs (depending on the number of cores used). To conclude, we will present scalability and memory usage studies, based on data gathered both on dedicated hardware and on ATLAS production nodes.
From these it should become apparent that the most promising development to improve performance will be to transition from a simple, flat event task farm, in which all processes handle events independently, to a task farm with specialized worker processes in charge of event I/O. This approach will further reduce the memory footprint of our multicore applications and at the same time address the issue of merging event worker outputs, at the cost of some increase in the complexity of the ATLAS core software.
id cern-1379474
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2011
record_format invenio
spelling cern-1379474 2019-09-30T06:29:59Z http://cds.cern.ch/record/1379474 eng ATL-SOFT-SLIDE-2011-506 oai:cds.cern.ch:1379474 2011-09-02
title Multicore in Production: Advantages and Limits of the Multiprocess Approach
topic Detectors and Experimental Techniques
url http://cds.cern.ch/record/1379474