CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies
Complex scientific workflows can process large amounts of data using thousands of tasks. The turnaround times of these workflows are often affected by various latencies such as the resource discovery, scheduling and data access latencies for the individual workflow processes or actors. Minimizing th...
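Main authors: | Hasham, Khawar; Delgado Peris, Antonio; Anjum, Ashiq; Evans, Dave; Hufnagel, Dirk; Huedo, Eduardo; Hernandez, Jose M.; McClatchey, Richard; Gowdy, Stephen; Metson, Simon |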
---|---|
Language: | eng |
Published: | 2010 |
Subjects: | Detectors and Experimental Techniques |
Online access: | https://dx.doi.org/10.1109/TNS.2011.2146276 http://cds.cern.ch/record/1306542 |
_version_ | 1780921155685711872 |
---|---|
author | Hasham, Khawar; Delgado Peris, Antonio; Anjum, Ashiq; Evans, Dave; Hufnagel, Dirk; Huedo, Eduardo; Hernandez, Jose M.; McClatchey, Richard; Gowdy, Stephen; Metson, Simon |
author_facet | Hasham, Khawar; Delgado Peris, Antonio; Anjum, Ashiq; Evans, Dave; Hufnagel, Dirk; Huedo, Eduardo; Hernandez, Jose M.; McClatchey, Richard; Gowdy, Stephen; Metson, Simon |
author_sort | Hasham, Khawar |
collection | CERN |
description | Complex scientific workflows can process large amounts of data using thousands
of tasks. The turnaround times of these workflows are often affected by various
latencies such as the resource discovery, scheduling and data access latencies
for the individual workflow processes or actors. Minimizing these latencies will
improve the overall execution time of a workflow and thus lead to a more
efficient and robust processing environment. In this paper, we propose a pilot
job based infrastructure that has intelligent data reuse and job execution
strategies to minimize the scheduling, queuing, execution and data access
latencies. The results have shown that significant improvements in the overall
turnaround time of a workflow can be achieved with this approach. The proposed
approach has been evaluated, first using the CMS Tier0 data processing workflow,
and then simulating the workflows to evaluate its effectiveness in a controlled
environment. |
id | cern-1306542 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2010 |
record_format | invenio |
spelling | cern-1306542; 2023-03-14T17:29:14Z; doi:10.1109/TNS.2011.2146276; http://cds.cern.ch/record/1306542; eng; Hasham, Khawar; Delgado Peris, Antonio; Anjum, Ashiq; Evans, Dave; Hufnagel, Dirk; Huedo, Eduardo; Hernandez, Jose M.; McClatchey, Richard; Gowdy, Stephen; Metson, Simon; CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies; Detectors and Experimental Techniques; arXiv:1202.5480; CMS-NOTE-2010-014; CERN-CMS-NOTE-2010-014; FERMILAB-PUB-10-717-CMS; oai:cds.cern.ch:1306542; 2010-09-28 |
spellingShingle | Detectors and Experimental Techniques Hasham, Khawar Delgado Peris, Antonio Anjum, Ashiq Evans, Dave Hufnagel, Dirk Huedo, Eduardo Hernandez, Jose M. McClatchey, Richard Gowdy, Stephen Metson, Simon CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies |
title | CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies |
title_full | CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies |
title_fullStr | CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies |
title_full_unstemmed | CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies |
title_short | CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies |
title_sort | cms workflow execution using intelligent job scheduling and data access strategies |
topic | Detectors and Experimental Techniques |
url | https://dx.doi.org/10.1109/TNS.2011.2146276 http://cds.cern.ch/record/1306542 |
work_keys_str_mv | AT hashamkhawar cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies AT delgadoperisantonio cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies AT anjumashiq cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies AT evansdave cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies AT hufnageldirk cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies AT huedoeduardo cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies AT hernandezjosem cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies AT mcclatcheyrichard cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies AT gowdystephen cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies AT metsonsimon cmsworkflowexecutionusingintelligentjobschedulinganddataaccessstrategies |