
Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing

The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the...

Bibliographic Details
Main authors: Klimentov, A, Buncic, P, De, K, Jha, S, Maeno, T, Mount, R, Nilsson, P, Oleynik, D, Panitkin, S, Petrosyan, A, Porter, R J, Read, K F, Vaniachine, A, Wells, J C, Wenaus, T
Language: eng
Published: 2015
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/608/1/012040
http://cds.cern.ch/record/2159063
_version_ 1780950826130341888
author Klimentov, A
Buncic, P
De, K
Jha, S
Maeno, T
Mount, R
Nilsson, P
Oleynik, D
Panitkin, S
Petrosyan, A
Porter, R J
Read, K F
Vaniachine, A
Wells, J C
Wenaus, T
author_facet Klimentov, A
Buncic, P
De, K
Jha, S
Maeno, T
Mount, R
Nilsson, P
Oleynik, D
Panitkin, S
Petrosyan, A
Porter, R J
Read, K F
Vaniachine, A
Wells, J C
Wenaus, T
author_sort Klimentov, A
collection CERN
description The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited with the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System (WMS) to manage the workflow for all data processing at hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, and O(10^3) users, and the ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project, titled ‘Next Generation Workload Management and Analysis System for Big Data’ (BigPanDA), is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center 'Kurchatov Institute', together with ALICE distributed computing and ORNL computing professionals. Our approach to integrating HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. We will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities' infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.
id oai-inspirehep.net-1372988
institution European Organization for Nuclear Research
language eng
publishDate 2015
record_format invenio
spelling oai-inspirehep.net-1372988 2019-09-30T06:29:59Z doi:10.1088/1742-6596/608/1/012040 http://cds.cern.ch/record/2159063 eng Klimentov, A; Buncic, P; De, K; Jha, S; Maeno, T; Mount, R; Nilsson, P; Oleynik, D; Panitkin, S; Petrosyan, A; Porter, R J; Read, K F; Vaniachine, A; Wells, J C; Wenaus, T Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing Computing and Computers The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited with the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System (WMS) to manage the workflow for all data processing at hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, and O(10^3) users, and the ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project, titled ‘Next Generation Workload Management and Analysis System for Big Data’ (BigPanDA), is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center 'Kurchatov Institute', together with ALICE distributed computing and ORNL computing professionals. Our approach to integrating HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. We will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities' infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications. oai:inspirehep.net:1372988 2015
spellingShingle Computing and Computers
Klimentov, A
Buncic, P
De, K
Jha, S
Maeno, T
Mount, R
Nilsson, P
Oleynik, D
Panitkin, S
Petrosyan, A
Porter, R J
Read, K F
Vaniachine, A
Wells, J C
Wenaus, T
Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
title Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
title_full Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
title_fullStr Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
title_full_unstemmed Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
title_short Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
title_sort next generation workload management system for big data on heterogeneous distributed computing
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/608/1/012040
http://cds.cern.ch/record/2159063
work_keys_str_mv AT klimentova nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT buncicp nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT dek nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT jhas nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT maenot nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT mountr nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT nilssonp nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT oleynikd nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT panitkins nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT petrosyana nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT porterrj nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT readkf nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT vaniachinea nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT wellsjc nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing
AT wenaust nextgenerationworkloadmanagementsystemforbigdataonheterogeneousdistributedcomputing