
Reducing Deadline Miss Rate for Grid Workloads running in Virtual Machines: a deadline-aware and adaptive approach

This thesis explores three major areas of research: integration of virtualization into scientific grid infrastructures, evaluation of the virtualization overhead on HPC grid jobs' performance, and optimization of job execution times to increase throughput by reducing the job deadline miss rate....


Bibliographic Details
Main author: Khalid, Omer
Language: eng
Published: Greenwich U. 2011
Subjects:
Online access: http://cds.cern.ch/record/1379868
_version_ 1780923059563134976
author Khalid, Omer
author_facet Khalid, Omer
author_sort Khalid, Omer
collection CERN
description This thesis explores three major areas of research: integration of virtualization into scientific grid infrastructures, evaluation of the virtualization overhead on HPC grid jobs' performance, and optimization of job execution times to increase throughput by reducing the job deadline miss rate. Integrating virtualization into the grid to deploy on-demand virtual machines for jobs, in a way that is transparent to end users and has minimal impact on the existing system, poses a significant challenge. This involves creating virtual machines, decompressing the operating system image, adapting the virtual environment to satisfy the software requirements of the job, constantly updating the job state once it is running without modifying the batch system or existing grid middleware, and finally bringing the host machine back to a consistent state. To facilitate this research, an existing in-production pilot job framework was modified to deploy virtual machines on demand on the grid, using the virtualization administrative domain to handle all I/O in order to increase network throughput. This approach limits the impact of change on the existing grid infrastructure while leveraging the execution and performance isolation capabilities of virtualization for job execution. This work led to an evaluation of various scheduling strategies used by the Xen hypervisor to measure the sensitivity of job performance to the amount of CPU and memory allocated under various configurations. However, virtualization overhead is also a critical factor in determining job execution times. Grid jobs have diverse requirements for machine resources such as CPU, memory, and network, and depend on other jobs to meet their deadlines, since the input of one job can be the output of a previous job. A novel resource provisioning model was devised to decrease the impact of virtualization overhead on job execution.
Finally, dynamic deadline-aware optimization algorithms were introduced, using exponential smoothing and rate limiting to predict job failure rates based on static and dynamic virtualization overhead. Statistical techniques were also integrated into the optimization algorithm to flag jobs at risk of missing their deadlines and to take preventive action to increase overall job throughput.
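As a rough illustration of the idea in the description, the sketch below applies simple exponential smoothing to a stream of job outcomes to estimate a deadline miss rate, and flags a job whose projected finish time exceeds its deadline once a virtualization overhead factor is included. It is not the thesis's algorithm: the function names, the `alpha` smoothing factor, and the multiplicative overhead model are all assumptions made for this example.

```python
def smoothed_miss_rate(outcomes, alpha=0.3):
    """Exponentially smooth job outcomes (truthy = deadline missed, falsy = met).

    `alpha` is a hypothetical smoothing factor; higher values weight
    recent jobs more heavily. Returns None for an empty sequence.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    estimate = None
    for missed in outcomes:
        x = 1.0 if missed else 0.0
        # The first observation seeds the estimate; later ones are blended in.
        estimate = x if estimate is None else alpha * x + (1.0 - alpha) * estimate
    return estimate


def at_risk(elapsed, remaining_work, overhead, deadline):
    """Flag a job whose projected finish time exceeds its deadline.

    Assumed model: the remaining work is inflated by a fractional
    virtualization overhead (0.1 means 10% slower than native).
    """
    projected = elapsed + remaining_work * (1.0 + overhead)
    return projected > deadline
```

A scheduler built along these lines might call `smoothed_miss_rate` after each completed job and use `at_risk` to decide whether a running job should be terminated early or given more resources; which action to take is a policy choice this sketch leaves open.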
id cern-1379868
institution European Organization for Nuclear Research
language eng
publishDate 2011
publisher Greenwich U.
record_format invenio
spelling cern-1379868 2019-09-30T06:29:59Z http://cds.cern.ch/record/1379868 eng Khalid, Omer Reducing Deadline Miss Rate for Grid Workloads running in Virtual Machines: a deadline-aware and adaptive approach Computing and Computers Greenwich U. CERN-THESIS-2011-082 oai:cds.cern.ch:1379868 2011
spellingShingle Computing and Computers
Khalid, Omer
Reducing Deadline Miss Rate for Grid Workloads running in Virtual Machines: a deadline-aware and adaptive approach
title Reducing Deadline Miss Rate for Grid Workloads running in Virtual Machines: a deadline-aware and adaptive approach
title_full Reducing Deadline Miss Rate for Grid Workloads running in Virtual Machines: a deadline-aware and adaptive approach
title_fullStr Reducing Deadline Miss Rate for Grid Workloads running in Virtual Machines: a deadline-aware and adaptive approach
title_full_unstemmed Reducing Deadline Miss Rate for Grid Workloads running in Virtual Machines: a deadline-aware and adaptive approach
title_short Reducing Deadline Miss Rate for Grid Workloads running in Virtual Machines: a deadline-aware and adaptive approach
title_sort reducing deadline miss rate for grid workloads running in virtual machines: a deadline-aware and adaptive approach
topic Computing and Computers
url http://cds.cern.ch/record/1379868
work_keys_str_mv AT khalidomer reducingdeadlinemissrateforgridworkloadsrunninginvirtualmachinesadeadlineawareandadaptiveapproach