Reproducibility of Computational Experiments on Kubernetes-Managed Container Clouds with HyperFlow

We propose a comprehensive solution for reproducibility of scientific workflows. We focus particularly on Kubernetes-managed container clouds, increasingly important in scientific computing. Our solution addresses conservation of the scientific procedure, scientific data, execution environment and e...

Bibliographic Details
Main Authors: Orzechowski, Michał, Baliś, Bartosz, Słota, Renata G., Kitowski, Jacek
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302264/
http://dx.doi.org/10.1007/978-3-030-50371-0_16
author Orzechowski, Michał
Baliś, Bartosz
Słota, Renata G.
Kitowski, Jacek
collection PubMed
description We propose a comprehensive solution for reproducibility of scientific workflows. We focus particularly on Kubernetes-managed container clouds, increasingly important in scientific computing. Our solution addresses conservation of the scientific procedure, scientific data, execution environment and experiment deployment, while using standard tools in order to avoid maintainability issues that can obstruct reproducibility. We introduce an Experiment Digital Object (EDO), a record published in an open science repository that contains artifacts required to reproduce an experiment. We demonstrate a variety of reproducibility scenarios including experiment repetition (same experiment and conditions), replication (same experiment, different conditions), and propose a smart reuse scenario in which a previous experiment is partially replayed and partially re-executed. The approach is implemented in the HyperFlow workflow management system and experimentally evaluated using a genomic scientific workflow. The experiment is published as an EDO record on the Zenodo platform.
format Online
Article
Text
id pubmed-7302264
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7302264 2020-06-18 Reproducibility of Computational Experiments on Kubernetes-Managed Container Clouds with HyperFlow Orzechowski, Michał Baliś, Bartosz Słota, Renata G. Kitowski, Jacek Computational Science – ICCS 2020 Article We propose a comprehensive solution for reproducibility of scientific workflows. We focus particularly on Kubernetes-managed container clouds, increasingly important in scientific computing. Our solution addresses conservation of the scientific procedure, scientific data, execution environment and experiment deployment, while using standard tools in order to avoid maintainability issues that can obstruct reproducibility. We introduce an Experiment Digital Object (EDO), a record published in an open science repository that contains artifacts required to reproduce an experiment. We demonstrate a variety of reproducibility scenarios including experiment repetition (same experiment and conditions), replication (same experiment, different conditions), and propose a smart reuse scenario in which a previous experiment is partially replayed and partially re-executed. The approach is implemented in the HyperFlow workflow management system and experimentally evaluated using a genomic scientific workflow. The experiment is published as an EDO record on the Zenodo platform. 2020-05-26 /pmc/articles/PMC7302264/ http://dx.doi.org/10.1007/978-3-030-50371-0_16 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Reproducibility of Computational Experiments on Kubernetes-Managed Container Clouds with HyperFlow
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302264/
http://dx.doi.org/10.1007/978-3-030-50371-0_16