Multicore in Production: Advantages and Limits of the Multiprocess Approach
The shared memory architecture of multicore CPUs provides HENP developers with the opportunity to reduce the memory footprint of their applications by sharing memory pages between the cores in a processor. ATLAS pioneered the multi-process approach to parallelizing HENP applications. Using Linux for...
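Main authors: | Binet, S; Calafiura, P; Lavrijsen, W; Leggett, Ch; Lesny, D; Jha, M K; Severini, H; Smith, D; Snyder, S; Tatarkhanov, M; Tsulaia, V; van Gemmeren, P; Washbrook, A |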
---|---|
Language: | eng |
Published: | 2011 |
Subjects: | Detectors and Experimental Techniques |
Online access: | http://cds.cern.ch/record/1379474 |
_version_ | 1780923037439229952 |
---|---|
author | Binet, S; Calafiura, P; Lavrijsen, W; Leggett, Ch; Lesny, D; Jha, M K; Severini, H; Smith, D; Snyder, S; Tatarkhanov, M; Tsulaia, V; van Gemmeren, P; Washbrook, A |
author_sort | Binet, S |
collection | CERN |
description | The shared memory architecture of multicore CPUs provides HENP developers with the opportunity to reduce the memory footprint of their applications by sharing memory pages between the cores in a processor. ATLAS pioneered the multi-process approach to parallelizing HENP applications. Using Linux fork() and the Copy On Write mechanism, we implemented a simple event task farm which allows up to 50% of memory pages to be shared among event worker processes with negligible CPU overhead. By leaving the task of managing shared memory pages to the operating system, we have been able to run in parallel large reconstruction and simulation applications originally written to be run in a single thread of execution, with little to no change to the application code. In spite of this, the process of validating athena multi-process for production took ten months of concentrated effort and is expected to continue for several more months. In general terms, we had two classes of problems in the multi-process port: merging the output files produced by the event workers, and ensuring the reproducibility of the results, especially of Monte Carlo simulations, when running with different configurations, in particular with different numbers of event workers. Besides validating the software itself, an important and time-consuming aspect of running multicore applications in production is configuring the production system to handle multicore jobs. This entails defining multicore batch queues, where the unit resource is not a core but a whole computing node; monitoring the output of many event workers; and adapting the job definition layer to handle computing resources with very different event throughputs (depending on the number of cores used). To conclude, we will present scalability and memory usage studies, based on data gathered both on dedicated hardware and on ATLAS production nodes. From these it should become apparent that the most promising development to improve performance will be to transition from a simple, flat event task farm, in which all processes handle events independently, to a task farm with specialized worker processes in charge of event I/O. This approach will further reduce the memory footprint of our multicore applications and at the same time address the issue of merging event worker outputs, at the cost of some increase in the complexity of the ATLAS core software. |
id | cern-1379474 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2011 |
record_format | invenio |
spelling | cern-1379474 2019-09-30T06:29:59Z http://cds.cern.ch/record/1379474 eng ATL-SOFT-SLIDE-2011-506 oai:cds.cern.ch:1379474 2011-09-02 |
title | Multicore in Production: Advantages and Limits of the Multiprocess Approach |
title_sort | multicore in production: advantages and limits of the multiprocess approach |
topic | Detectors and Experimental Techniques |
url | http://cds.cern.ch/record/1379474 |
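
The description above mentions a flat event task farm built on fork() and Copy On Write, and the difficulty of keeping Monte Carlo results reproducible when the number of event workers changes. The following is a minimal illustrative sketch of that idea in Python, not the ATLAS implementation: the names `run_farm`, `process_event`, and `geometry` are hypothetical, the round-robin event assignment is an assumption, and seeding the random generator from the event number is one common way to make results independent of the worker count. Note also that CPython's reference counting writes to object headers, so fewer pages stay shared than would in a C++ application.

```python
import os
import pickle
import random


def process_event(event_number, geometry):
    """Hypothetical per-event work. The seed depends only on the event
    number, never on which worker happens to process the event."""
    rng = random.Random(event_number)
    return geometry[event_number % len(geometry)] + rng.random()


def run_farm(n_events, n_workers):
    # Read-only state initialised once in the parent. After fork() the
    # children reference the same physical pages until someone writes.
    geometry = list(range(1000))

    children = []
    for worker in range(n_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child process: one event worker
            try:
                os.close(read_fd)
                # Round-robin assignment of events (an assumption).
                results = {
                    ev: process_event(ev, geometry)
                    for ev in range(worker, n_events, n_workers)
                }
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump(results, out)
            finally:
                os._exit(0)  # never fall through into the parent's code
        os.close(write_fd)
        children.append((pid, read_fd))

    # Parent: merge the per-worker outputs back into event order.
    merged = {}
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as inp:
            merged.update(pickle.load(inp))
        os.waitpid(pid, 0)
    return [merged[ev] for ev in range(n_events)]
```

Calling `run_farm(20, 1)` and `run_farm(20, 4)` should return identical lists, because each event's random stream is fixed by its event number. The merge step in the parent stands in for the output-file merging problem named in the description; in this sketch it is a dictionary update rather than a file merge.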