
Providing on-demand interactive notebook-based environments with transparent access to cloud storage and specialised hardware through the INFN Cloud Platform


Bibliographic Details
Main Author: Ciangottini, Diego
Language: eng
Published: 2023
Subjects: HEP Computing
Online Access: http://cds.cern.ch/record/2855644
_version_ 1780977470302846976
author Ciangottini, Diego
collection CERN
description The Italian National Institute for Nuclear Physics (INFN) has a long history of designing and implementing large-scale computing infrastructures and applications. Over the past ten years, INFN has invested heavily in developing solutions that enable, optimise and simplify transparent access to a multi-site federated cloud infrastructure. A primary goal of this effort is to provide a generic model that allows INFN and other users to access resources fairly and simply, regardless of the complexity of their requirements, their proximity to a powerful computing centre, or their ability to administer advanced resources such as those offering GPUs. The ultimate objective is to shorten both the “time-to-market” and the learning curve for deploying, managing, and utilising computing services on a federated cloud system. For this purpose, INFN Cloud provides a rich set of compute and storage services that can be deployed automatically on geographically distributed sites in an easy and transparent way. One of the services most frequently requested by members of different scientific communities is based on Jupyter notebooks. We have therefore been adapting the standard JupyterHub setup to provide a flexible and extensible multi-user service with several key integrations. First, authentication is based on OpenID Connect, while authorization relies on OAuth attributes (such as the user’s subject and groups) to grant admin or regular permissions. JupyterLab instances are spawned in containers, which may start from custom images that encapsulate the required libraries, depending on users’ needs (e.g. experiment software, big data analytics tools). All containers mount two types of storage: a local area, where data is stored on the node filesystem, and a remote area, which provides POSIX access to the INFN Cloud Object Storage. Files (notebooks, data, etc.) saved in the local area persist only as long as the node hosting the notebook servers is up and running, whereas data saved in the cloud area can be accessed at any time, either through the notebook or through the web interface of the INFN Cloud Object Storage service. GPUs are also supported for running compute-intensive workloads. The automated configuration has also been tested with partitioned A100 GPUs: in this case, each notebook container receives an available partition of the GPU. This contribution will describe the implementation of the service and some example use cases running on INFN Cloud.
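The authorization step described in the abstract (OAuth attributes such as the user's subject and groups deciding between admin and regular permissions) can be sketched as a simple claim-mapping function. This is an illustrative sketch only, not the service's actual code: the group name `infn-cloud-admins` is a hypothetical placeholder, and in a real deployment such logic typically lives inside the configured JupyterHub authenticator rather than standalone code.

```python
# Sketch: map OIDC/OAuth token claims to JupyterHub-style permissions.
# The claim names ("sub", "groups") follow common OpenID Connect usage;
# the admin group below is an assumed placeholder, not INFN Cloud's real one.

ADMIN_GROUPS = frozenset({"infn-cloud-admins"})  # hypothetical group name


def resolve_permissions(userinfo: dict) -> dict:
    """Return the user's identity and whether they receive admin rights.

    `userinfo` is the decoded claim set returned by the OIDC provider's
    userinfo endpoint (or the ID token).
    """
    groups = set(userinfo.get("groups", []))
    return {
        "user": userinfo["sub"],                # OIDC subject identifies the user
        "admin": bool(groups & ADMIN_GROUPS),   # any admin group grants admin
        "groups": sorted(groups),
    }
```

In practice, JupyterHub deployments usually express this mapping declaratively through an OAuthenticator configuration (claim-based admin and allowed groups) instead of hand-written code; the function above only makes the decision rule explicit.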
id cern-2855644
institution European Organization for Nuclear Research
language eng
publishDate 2023
record_format invenio
spelling cern-2855644 2023-04-06T20:32:33Z http://cds.cern.ch/record/2855644 eng Ciangottini, Diego. Providing on-demand interactive notebook-based environments with transparent access to cloud storage and specialised hardware through the INFN Cloud Platform. CS3 2023 - Cloud Storage Synchronization and Sharing. HEP Computing. oai:cds.cern.ch:2855644 2023
title Providing on-demand interactive notebook-based environments with transparent access to cloud storage and specialised hardware through the INFN Cloud Platform
topic HEP Computing
url http://cds.cern.ch/record/2855644