Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases

Computational workflows are widely used in data analysis, enabling automated tracking of steps and storage of provenance information, leading to innovation and decision-making in the scientific community. However, the growing popularity of workflows has raised concerns about reproducibility and reus...

Bibliographic Details
Main authors: Kale, Amruta; Sun, Ziheng; Ma, Xiaogang
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2023
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10450007/
https://www.ncbi.nlm.nih.gov/pubmed/37636584
http://dx.doi.org/10.1007/s12145-023-01045-0
_version_ 1785095095662411776
author Kale, Amruta
Sun, Ziheng
Ma, Xiaogang
collection PubMed
description Computational workflows are widely used in data analysis, enabling automated tracking of steps and storage of provenance information, leading to innovation and decision-making in the scientific community. However, the growing popularity of workflows has raised concerns about reproducibility and reusability which can hinder collaboration between institutions and users. In order to address these concerns, it is important to standardize workflows or provide tools that offer a framework for describing workflows and enabling computational reusability. One such set of standards that has recently emerged is the Common Workflow Language (CWL), which offers a robust and flexible framework for data analysis tools and workflows. To promote portability, reproducibility, and interoperability of AI/ML workflows, we developed geoweaver_cwl, a Python package that automatically describes AI/ML workflows from a workflow management system (WfMS) named Geoweaver into CWL. In this paper, we test our Python package on multiple use cases from different domains. Our objective is to demonstrate and verify the utility of this package. We make all the code and dataset open online and briefly describe the experimental implementation of the package in this paper, confirming that geoweaver_cwl can lead to a well-versed AI process while disclosing opportunities for further extensions. The geoweaver_cwl package is publicly released online at https://pypi.org/project/geoweaver-cwl/0.0.1/ and exemplar results are accessible at: https://github.com/amrutakale08/geoweaver_cwl-usecases.
format Online
Article
Text
id pubmed-10450007
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-10450007 2023-08-26 Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases Kale, Amruta Sun, Ziheng Ma, Xiaogang Earth Sci Inform Software Springer Berlin Heidelberg 2023-07-10 2023 /pmc/articles/PMC10450007/ /pubmed/37636584 http://dx.doi.org/10.1007/s12145-023-01045-0 Text en © The Author(s) 2023.
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Software
Kale, Amruta
Sun, Ziheng
Ma, Xiaogang
Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases
title Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10450007/
https://www.ncbi.nlm.nih.gov/pubmed/37636584
http://dx.doi.org/10.1007/s12145-023-01045-0