Studies on dynamic load balancing in a distributed system acquiring logically related data from multiple sources, with particular emphasis on load metric and communication
Main authors: | , , |
---|---|
Language: | eng |
Published: | 2010 |
Subjects: | |
Online access: | http://cds.cern.ch/record/1328340 |
Summary: | The proposed method is designed for a data acquisition system that acquires data from n independent sources. The sources are assumed to produce fragments that together constitute a logical whole. These fragments are produced at the same frequency and in the same sequence. The algorithm discussed aims to balance the data dynamically between m logically autonomous processing units (each consisting of computing nodes) when their processing power varies, for example because of faults such as failing computing nodes or broken network connections.
As a case study we consider the Data Acquisition System of the Compact Muon Solenoid experiment at CERN's new Large Hadron Collider. The system acquires data from about 500 sources and combines them into full events. Each data source is expected to deliver event fragments with an average size of 2 kB at a frequency of 100 kHz.
In this paper we present the results of applying the proposed load metric and load communication pattern. Moreover, we discuss their impact on the algorithm's overall efficiency and scalability, as well as on the fault tolerance of the whole system. We also propose a general concept for an algorithm that allows all source nodes to choose the destination processing unit asynchronously and ensures that all fragments of the same logical data always go to the same unit. |
---|
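
The summary's last point, that every source picks the destination unit on its own yet all fragments of one event reach the same unit, can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the record does not give. It assumes that every source sees the same event sequence number and holds an identical table of per-unit weights (for example derived from the load metric), and that a weight of 0 marks a failed unit. The name `choose_unit` and the weight table are hypothetical.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate
from typing import Sequence


def choose_unit(event_id: int, weights: Sequence[int]) -> int:
    """Return the index of the processing unit for one event.

    The result depends only on (event_id, weights), so every source that
    holds the same weight table picks the same unit without exchanging
    messages with the other sources. Weights are assumed non-negative.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one unit must have a positive weight")
    # Stable hash of the event id. Python's built-in hash() is avoided so
    # the value is identical across processes and machines.
    digest = hashlib.sha256(event_id.to_bytes(8, "big")).digest()
    point = int.from_bytes(digest[:8], "big") % total
    # Cumulative weights split [0, total) into one slice per unit;
    # a unit with weight 0 gets an empty slice and is never selected.
    bounds = list(accumulate(weights))
    return bisect_right(bounds, point)


if __name__ == "__main__":
    weights = [4, 4, 0, 2]  # hypothetical table: unit 2 is marked as failed
    # Five independent "sources" evaluate the same events separately.
    picks = [[choose_unit(e, weights) for e in range(1000)] for _ in range(5)]
    print(all(p == picks[0] for p in picks))
```

For scale, the figures quoted in the summary (about 500 sources, 2 kB average fragments, 100 kHz) imply events of roughly 1 MB and an aggregate input of roughly 100 GB/s. In a sketch like this, a change to the weight table would also have to take effect at the same event number in every source, which is left out here.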