Reproducibility of Computational Experiments on Kubernetes-Managed Container Clouds with HyperFlow
We propose a comprehensive solution for reproducibility of scientific workflows. We focus particularly on Kubernetes-managed container clouds, increasingly important in scientific computing. Our solution addresses conservation of the scientific procedure, scientific data, execution environment and e...
Main Authors: | Orzechowski, Michał; Baliś, Bartosz; Słota, Renata G.; Kitowski, Jacek |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302264/ http://dx.doi.org/10.1007/978-3-030-50371-0_16 |
_version_ | 1783547813335924736 |
---|---|
author | Orzechowski, Michał Baliś, Bartosz Słota, Renata G. Kitowski, Jacek |
author_sort | Orzechowski, Michał |
collection | PubMed |
description | We propose a comprehensive solution for reproducibility of scientific workflows. We focus particularly on Kubernetes-managed container clouds, increasingly important in scientific computing. Our solution addresses conservation of the scientific procedure, scientific data, execution environment and experiment deployment, while using standard tools in order to avoid maintainability issues that can obstruct reproducibility. We introduce an Experiment Digital Object (EDO), a record published in an open science repository that contains artifacts required to reproduce an experiment. We demonstrate a variety of reproducibility scenarios including experiment repetition (same experiment and conditions), replication (same experiment, different conditions), and propose a smart reuse scenario in which a previous experiment is partially replayed and partially re-executed. The approach is implemented in the HyperFlow workflow management system and experimentally evaluated using a genomic scientific workflow. The experiment is published as an EDO record on the Zenodo platform. |
format | Online Article Text |
id | pubmed-7302264 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-73022642020-06-18 Reproducibility of Computational Experiments on Kubernetes-Managed Container Clouds with HyperFlow Orzechowski, Michał Baliś, Bartosz Słota, Renata G. Kitowski, Jacek Computational Science – ICCS 2020 Article We propose a comprehensive solution for reproducibility of scientific workflows. We focus particularly on Kubernetes-managed container clouds, increasingly important in scientific computing. Our solution addresses conservation of the scientific procedure, scientific data, execution environment and experiment deployment, while using standard tools in order to avoid maintainability issues that can obstruct reproducibility. We introduce an Experiment Digital Object (EDO), a record published in an open science repository that contains artifacts required to reproduce an experiment. We demonstrate a variety of reproducibility scenarios including experiment repetition (same experiment and conditions), replication (same experiment, different conditions), and propose a smart reuse scenario in which a previous experiment is partially replayed and partially re-executed. The approach is implemented in the HyperFlow workflow management system and experimentally evaluated using a genomic scientific workflow. The experiment is published as an EDO record on the Zenodo platform. 2020-05-26 /pmc/articles/PMC7302264/ http://dx.doi.org/10.1007/978-3-030-50371-0_16 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Orzechowski, Michał Baliś, Bartosz Słota, Renata G. Kitowski, Jacek Reproducibility of Computational Experiments on Kubernetes-Managed Container Clouds with HyperFlow |
title | Reproducibility of Computational Experiments on Kubernetes-Managed Container Clouds with HyperFlow |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302264/ http://dx.doi.org/10.1007/978-3-030-50371-0_16 |