JupyterLab+ScienceMesh: Collaborative Data Science in sync-and-share environment.
Collaborative Data Science is becoming increasingly important as organizations become more data-driven and Data Science projects and models grow more complex. In the report **Critical Capabilities for Data Science and Machine Learning Platforms** (March 2021), Gartner predict...
Main author: | Sieprawski, Marcin |
---|---|
Language: | eng |
Published: | 2022 |
Subjects: | HEP Computing |
Online access: | http://cds.cern.ch/record/2802275 |
_version_ | 1780972737967161344 |
---|---|
author | Sieprawski, Marcin |
author_facet | Sieprawski, Marcin |
author_sort | Sieprawski, Marcin |
collection | CERN |
description | Collaborative Data Science is becoming increasingly important as organizations become more data-driven and Data Science projects and models grow more complex. In the report **Critical Capabilities for Data Science and Machine Learning Platforms** (March 2021), Gartner predicts that in the near future collective intelligence in Data Science and cloud-based AI infrastructure will be among the key factors for competitive advantage.
This talk presents Distributed Data Science environments (part of [ScienceMesh](https://sciencemesh.io/)), which allow collaboration on [Jupyter Notebooks](https://jupyter.org/) in a sync-and-share environment.
Jupyter Notebook has become the number one platform used by data scientists to build interactive applications and to work with big data and AI. It is widely used in CS3 institutions, and many successful applications have been presented at CS3 conferences.
ScienceMesh, developed in the [CS3MESH4EOSC](https://cs3mesh4eosc.eu/) project, creates the Federated Scientific Mesh, which provides federated sharing of data across different sync-and-share services, federated use of applications (such as collaborative document editing, data archiving, and data publishing), fast transfer of large datasets, and remote data analysis (Data Science environments).
For Data Science environments, ScienceMesh delivers a [JupyterLab](https://jupyterlab.readthedocs.io/) extension that integrates the JupyterLab environment with ScienceMesh. File browsing and additional sharing and collaboration functionality for notebooks and resources across the federated cloud are now available in the JupyterLab environment. JupyterLab is considered a complete, full-fledged IDE for Data Science tasks and interactive computing, where data scientists can do all their work in one tool; the point is therefore that functionality for sharing (a full [cs3apis](https://github.com/cs3org/cs3apis) client) and concurrent editing is available inside this environment. At the same time, Data Science environments are integrated with a comprehensive suite of Data Services in ScienceMesh to support complete research and Data Science workflows using existing collaboration tools.
The relevance and benefits of ScienceMesh Data Science Environments will be discussed in the context of two scientific use cases ([High Energy Physics](https://cs3mesh4eosc.eu/use-cases/data-science-environments-high-energy-physics-cern) and [Earth Observation](https://cs3mesh4eosc.eu/use-cases/data-science-environments-monitoring-land-degradation)), along with various business-related scenarios. |
id | cern-2802275 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2022 |
record_format | invenio |
spelling | cern-2802275 2022-11-02T22:04:02Z http://cds.cern.ch/record/2802275 eng Sieprawski, Marcin JupyterLab+ScienceMesh: Collaborative Data Science in sync-and-share environment. CS3 2022 - Cloud Storage Synchronization and Sharing HEP Computing oai:cds.cern.ch:2802275 2022 |
spellingShingle | HEP Computing Sieprawski, Marcin JupyterLab+ScienceMesh: Collaborative Data Science in sync-and-share environment. |
title | JupyterLab+ScienceMesh: Collaborative Data Science in sync-and-share environment. |
title_full | JupyterLab+ScienceMesh: Collaborative Data Science in sync-and-share environment. |
title_fullStr | JupyterLab+ScienceMesh: Collaborative Data Science in sync-and-share environment. |
title_full_unstemmed | JupyterLab+ScienceMesh: Collaborative Data Science in sync-and-share environment. |
title_short | JupyterLab+ScienceMesh: Collaborative Data Science in sync-and-share environment. |
title_sort | jupyterlab+sciencemesh: collaborative data science in sync-and-share environment. |
topic | HEP Computing |
url | http://cds.cern.ch/record/2802275 |
work_keys_str_mv | AT sieprawskimarcin jupyterlabsciencemeshcollaborativedatascienceinsyncandshareenvironment AT sieprawskimarcin cs32022cloudstoragesynchronizationandsharing |