Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases
Computational workflows are widely used in data analysis, enabling automated tracking of steps and storage of provenance information, leading to innovation and decision-making in the scientific community. However, the growing popularity of workflows has raised concerns about reproducibility and reus...
Main Authors: | Kale, Amruta; Sun, Ziheng; Ma, Xiaogang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Berlin Heidelberg 2023 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10450007/ https://www.ncbi.nlm.nih.gov/pubmed/37636584 http://dx.doi.org/10.1007/s12145-023-01045-0 |
_version_ | 1785095095662411776 |
---|---|
author | Kale, Amruta Sun, Ziheng Ma, Xiaogang |
author_facet | Kale, Amruta Sun, Ziheng Ma, Xiaogang |
author_sort | Kale, Amruta |
collection | PubMed |
description | Computational workflows are widely used in data analysis, enabling automated tracking of steps and storage of provenance information, leading to innovation and decision-making in the scientific community. However, the growing popularity of workflows has raised concerns about reproducibility and reusability which can hinder collaboration between institutions and users. In order to address these concerns, it is important to standardize workflows or provide tools that offer a framework for describing workflows and enabling computational reusability. One such set of standards that has recently emerged is the Common Workflow Language (CWL), which offers a robust and flexible framework for data analysis tools and workflows. To promote portability, reproducibility, and interoperability of AI/ML workflows, we developed geoweaver_cwl, a Python package that automatically describes AI/ML workflows from a workflow management system (WfMS) named Geoweaver into CWL. In this paper, we test our Python package on multiple use cases from different domains. Our objective is to demonstrate and verify the utility of this package. We make all the code and dataset open online and briefly describe the experimental implementation of the package in this paper, confirming that geoweaver_cwl can lead to a well-versed AI process while disclosing opportunities for further extensions. The geoweaver_cwl package is publicly released online at https://pypi.org/project/geoweaver-cwl/0.0.1/ and exemplar results are accessible at: https://github.com/amrutakale08/geoweaver_cwl-usecases. |
format | Online Article Text |
id | pubmed-10450007 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-104500072023-08-26 Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases Kale, Amruta Sun, Ziheng Ma, Xiaogang Earth Sci Inform Software Computational workflows are widely used in data analysis, enabling automated tracking of steps and storage of provenance information, leading to innovation and decision-making in the scientific community. However, the growing popularity of workflows has raised concerns about reproducibility and reusability which can hinder collaboration between institutions and users. In order to address these concerns, it is important to standardize workflows or provide tools that offer a framework for describing workflows and enabling computational reusability. One such set of standards that has recently emerged is the Common Workflow Language (CWL), which offers a robust and flexible framework for data analysis tools and workflows. To promote portability, reproducibility, and interoperability of AI/ML workflows, we developed geoweaver_cwl, a Python package that automatically describes AI/ML workflows from a workflow management system (WfMS) named Geoweaver into CWL. In this paper, we test our Python package on multiple use cases from different domains. Our objective is to demonstrate and verify the utility of this package. We make all the code and dataset open online and briefly describe the experimental implementation of the package in this paper, confirming that geoweaver_cwl can lead to a well-versed AI process while disclosing opportunities for further extensions. The geoweaver_cwl package is publicly released online at https://pypi.org/project/geoweaver-cwl/0.0.1/ and exemplar results are accessible at: https://github.com/amrutakale08/geoweaver_cwl-usecases. Springer Berlin Heidelberg 2023-07-10 2023 /pmc/articles/PMC10450007/ /pubmed/37636584 http://dx.doi.org/10.1007/s12145-023-01045-0 Text en © The Author(s) 2023. 
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Software Kale, Amruta Sun, Ziheng Ma, Xiaogang Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases |
title | Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases |
title_full | Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases |
title_fullStr | Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases |
title_full_unstemmed | Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases |
title_short | Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases |
title_sort | utility of the python package geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10450007/ https://www.ncbi.nlm.nih.gov/pubmed/37636584 http://dx.doi.org/10.1007/s12145-023-01045-0 |
work_keys_str_mv | AT kaleamruta utilityofthepythonpackagegeoweavercwlforimprovingworkflowreusabilityanillustrationwithmultidisciplinaryusecases AT sunziheng utilityofthepythonpackagegeoweavercwlforimprovingworkflowreusabilityanillustrationwithmultidisciplinaryusecases AT maxiaogang utilityofthepythonpackagegeoweavercwlforimprovingworkflowreusabilityanillustrationwithmultidisciplinaryusecases |