Distributed Late-binding Micro-scheduling and Data Caching for Data-Intensive Workflows
Main author: | |
---|---|
Language: | eng |
Published: | 2018 |
Subjects: | |
Online access: | http://cds.cern.ch/record/2635141 |
Summary: | Today’s world is flooded with vast amounts of digital information coming
from innumerable sources. Moreover, it seems clear that this trend will only
intensify in the future. Industry, society and—remarkably—science are not
indifferent to this fact. On the contrary, they are struggling to get the most
out of this data, which means that they need to capture, transfer, store and
process it in a timely and efficient manner, using a wide range of computational
resources. And this task is not always simple. A very representative
example of the challenges posed by the management and processing of large
quantities of data is that of the Large Hadron Collider experiments, which
handle tens of petabytes of physics information every year. Based on the
experience of one of these collaborations, we have studied the main issues
involved in the management of huge volumes of data and in the completion
of sizeable workflows that consume it.
In this context, we have developed a general-purpose architecture for
the scheduling and execution of workflows with heavy data requirements:
the Task Queue. This new system builds on the late-binding overlay model,
which has helped experiments to successfully overcome the problems associated
with the heterogeneity and complexity of large computational grids.
Our proposal introduces several enhancements to the existing systems. The
execution agents of the Task Queue architecture share a Distributed Hash
Table (DHT) and perform job matching and assignment cooperatively. In
this way, scalability problems of centralized matching algorithms are avoided
and workflow execution times are improved. Scalability makes fine-grained
micro-scheduling possible and enables new functionalities, like the implementation
of a distributed data cache on the execution nodes and the integration
of data location information in the scheduling decisions. This improves
the efficiency of data processing and helps alleviate the commonly
congested grid storage services. In addition, our system is more resilient to
problems in the central server and behaves better in scenarios with demanding
data access patterns or with no local storage service available, as an
extensive set of assessment tests has proven.
Since our distributed task scheduling procedure requires the use of broadcast
messages, we have also performed an exhaustive study of the possible approaches to implement this operation on top of the Kademlia DHT, which
was already used for the shared data cache. Kademlia provides routing to
individual nodes but no broadcast primitive. Our work exposes the particularities
of this system, notably its XOR-based distance metric, and analytically
studies which broadcasting techniques can be applied to it. A model
that estimates node coverage as a function of the probability that individual
messages reach their destination has also been developed. For validation, the
algorithms have been implemented and comprehensively evaluated. Moreover,
several techniques are proposed to enhance the bare protocols under
adverse conditions such as churn and message or node failures.
These include redundancy, resubmission and flooding, as well as combinations
of them. An analysis of the strengths and weaknesses of algorithms and
additional techniques is presented. |
---|
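
The summary mentions Kademlia's XOR-based distance metric, broadcasting on top of the DHT, and a model of node coverage as a function of per-message delivery probability. The following is a minimal Python sketch of one well-known way to build such a broadcast (delegating one k-bucket per message), not necessarily any of the algorithms evaluated in the thesis. It assumes an idealized, fully populated ID space of `bits` bits; the names `xor_distance`, `bucket_index`, `broadcast` and `expected_coverage` are hypothetical, and the closed-form coverage expression holds only for this idealized tree without redundancy or retries.

```python
import random


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two node IDs: the bitwise XOR read as an integer."""
    return a ^ b


def bucket_index(a: int, b: int) -> int:
    """Index i of the k-bucket of `a` that holds `b`, i.e. 2**i <= a ^ b < 2**(i + 1)."""
    d = xor_distance(a, b)
    if d == 0:
        raise ValueError("a node does not appear in its own buckets")
    return d.bit_length() - 1


def broadcast(origin: int, bits: int, p: float = 1.0, rng=None) -> set:
    """Bucket-based broadcast over a fully populated ID space (an assumption).

    A node responsible for all IDs at distance < 2**height sends one message
    per bucket i < height. The chosen contact becomes responsible for the IDs
    at distance < 2**i from itself, which is exactly that bucket.
    Each message is delivered independently with probability p.
    """
    rng = rng or random.Random()
    reached = {origin}
    pending = [(origin, bits)]
    while pending:
        node, height = pending.pop()
        for i in range(height):
            # Any contact at distance in [2**i, 2**(i + 1)) belongs to bucket i.
            contact = node ^ ((1 << i) | rng.randrange(1 << i))
            if rng.random() < p:  # message lost with probability 1 - p
                reached.add(contact)
                pending.append((contact, i))
    return reached


def expected_coverage(bits: int, p: float) -> float:
    """Expected fraction of nodes reached in the idealized tree above.

    The delegation tree is binomial: C(bits, d) nodes sit at depth d and each
    is reached with probability p**d, so the mean coverage is ((1 + p) / 2)**bits.
    """
    return ((1 + p) / 2) ** bits
```

With `p = 1.0` every node receives the message exactly once, because the buckets partition the ID space. With lossy delivery, a single lost message near the root cuts off a whole subtree, which is why techniques such as redundancy or resubmission, as listed in the summary, matter in practice.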