
CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies

Complex scientific workflows can process large amounts of data using thousands of tasks. The turnaround times of these workflows are often affected by various latencies such as the resource discovery, scheduling and data access latencies for the individual workflow processes or actors. Minimizing th...


Bibliographic Details
Main authors: Hasham, Khawar; Delgado Peris, Antonio; Anjum, Ashiq; Evans, Dave; Hufnagel, Dirk; Huedo, Eduardo; Hernandez, Jose M.; McClatchey, Richard; Gowdy, Stephen; Metson, Simon
Language: eng
Published: 2010
Subjects: Detectors and Experimental Techniques
Online access: https://dx.doi.org/10.1109/TNS.2011.2146276
http://cds.cern.ch/record/1306542
_version_ 1780921155685711872
author Hasham, Khawar
Delgado Peris, Antonio
Anjum, Ashiq
Evans, Dave
Hufnagel, Dirk
Huedo, Eduardo
Hernandez, Jose M.
McClatchey, Richard
Gowdy, Stephen
Metson, Simon
author_sort Hasham, Khawar
collection CERN
description Complex scientific workflows can process large amounts of data using thousands of tasks. The turnaround times of these workflows are often affected by various latencies such as the resource discovery, scheduling and data access latencies for the individual workflow processes or actors. Minimizing these latencies will improve the overall execution time of a workflow and thus lead to a more efficient and robust processing environment. In this paper, we propose a pilot job based infrastructure that has intelligent data reuse and job execution strategies to minimize the scheduling, queuing, execution and data access latencies. The results have shown that significant improvements in the overall turnaround time of a workflow can be achieved with this approach. The proposed approach has been evaluated, first using the CMS Tier0 data processing workflow, and then simulating the workflows to evaluate its effectiveness in a controlled environment.
id cern-1306542
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2010
record_format invenio
spelling cern-1306542
2023-03-14T17:29:14Z
doi:10.1109/TNS.2011.2146276
http://cds.cern.ch/record/1306542
arXiv:1202.5480
CMS-NOTE-2010-014
CERN-CMS-NOTE-2010-014
FERMILAB-PUB-10-717-CMS
oai:cds.cern.ch:1306542
2010-09-28
title CMS Workflow Execution using Intelligent Job Scheduling and Data Access Strategies
topic Detectors and Experimental Techniques
url https://dx.doi.org/10.1109/TNS.2011.2146276
http://cds.cern.ch/record/1306542