Cargando…

Distributed Late-binding Micro-scheduling and Data Caching for Data-Intensive Workflows

Today’s world is flooded with vast amounts of digital information coming from innumerable sources. Moreover, it seems clear that this trend will only intensify in the future. Industry, society and—remarkably—science are not indifferent to this fact. On the contrary, they are struggling to get the mo...

Descripción completa

Detalles Bibliográficos
Autor principal: Delgado Peris, Antonio
Lenguaje:eng
Publicado: 2018
Materias:
Acceso en línea:http://cds.cern.ch/record/2635141
Descripción
Sumario:Today’s world is flooded with vast amounts of digital information coming from innumerable sources. Moreover, it seems clear that this trend will only intensify in the future. Industry, society and—remarkably—science are not indifferent to this fact. On the contrary, they are struggling to get the most out of this data, which means that they need to capture, transfer, store and process it in a timely and efficient manner, using a wide range of computational resources. And this task is not always simple. A very representative example of the challenges posed by the management and processing of large quantities of data is that of the Large Hadron Collider experiments, which handle tens of petabytes of physics information every year. Based on the experience of one of these collaborations, we have studied the main issues involved in the management of huge volumes of data and in the completion of sizeable workflows that consume it. In this context, we have developed a general-purpose architecture for the scheduling and execution of workflows with heavy data requirements: the Task Queue. This new system builds on the late-binding overlay model, which has helped experiments to successfully overcome the problems associated to the heterogeneity and complexity of large computational grids. Our proposal introduces several enhancements to the existing systems. The execution agents of the Task Queue architecture share a Distributed Hash Table (DHT) and perform job matching and assignment cooperatively. In this way, scalability problems of centralized matching algorithms are avoided and workflow execution times are improved. Scalability makes fine-grained micro-scheduling possible and enables new functionalities, like the implementation of a distributed data cache on the execution nodes and the integration of data location information in the scheduling decisions. This improves the efficiency of data processing and helps alleviate the commonly congested grid storage services. In addition, our system is more resilient to problems in the central server and behaves better in scenarios with demanding data access patterns or with no local storage service availablle, as an extensive set of assessment tests has proven. Since our distributed task scheduling procedure requires the use of broadcast messages, we have also performed an exhaustive study of the possible approaches to implement this operation on top of the Kademlia DHT, which was already used for the shared data cache. Kademlia provided individual node routing but no broadcast primitive. Our work exposes the particularities of this system, notably its XOR-based distance metrics, and analytically studies which broadcasting techniques can be applied to it. A model that estimates node coverage as a function of the probability that individual messages reach their destination has also been developed. For validation, the algorithms have been implemented and comprehensively evaluated. Moreover, several techniques are proposed to enhance the bare protocols when adverse circumstances such as churn and failure rate conditions are present. These include redundancy, resubmissions or flooding, and also combinations of those. An analysis of the strengths and weaknesses of algorithms and additional techniques is presented.