Simplifying the development of portable, scalable, and reproducible workflows
Command-line software plays a critical role in biology research. However, processes for installing and executing software differ widely. The Common Workflow Language (CWL) is a community standard that addresses this problem. Using CWL, tool developers can formally describe a tool’s inputs, outputs,...
Main authors: | Piccolo, Stephen R; Ence, Zachary E; Anderson, Elizabeth C; Chang, Jeffrey T; Bild, Andrea H |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | eLife Sciences Publications, Ltd, 2021 |
Subjects: | Computational and Systems Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8514239/ https://www.ncbi.nlm.nih.gov/pubmed/34643507 http://dx.doi.org/10.7554/eLife.71069 |
_version_ | 1784583346207064064 |
---|---|
author | Piccolo, Stephen R Ence, Zachary E Anderson, Elizabeth C Chang, Jeffrey T Bild, Andrea H |
author_sort | Piccolo, Stephen R |
collection | PubMed |
description | Command-line software plays a critical role in biology research. However, processes for installing and executing software differ widely. The Common Workflow Language (CWL) is a community standard that addresses this problem. Using CWL, tool developers can formally describe a tool’s inputs, outputs, and other execution details. CWL documents can include instructions for executing tools inside software containers. Accordingly, CWL tools are portable: they can be executed on diverse computers, including personal workstations, high-performance clusters, or the cloud. CWL also supports workflows, which describe dependencies among tools and how outputs from one tool are used as inputs to others. To date, CWL has been used primarily for batch processing of large datasets, especially in genomics, but it can also be used for the analytical steps of a study. This article explains key concepts about CWL and software containers and provides examples for using CWL in biology research. CWL documents are text-based, so they can be created manually, without computer programming. However, the difficulty of ensuring that these documents conform to the CWL specification may prevent some users from adopting CWL. To address this gap, we created ToolJig, a Web application that enables researchers to create CWL documents interactively. ToolJig validates information provided by the user to ensure it is complete and valid. After creating a CWL tool or workflow, the user can create ‘input-object’ files, which store values for a particular invocation of a tool or workflow. In addition, ToolJig provides examples of how to execute the tool or workflow via a workflow engine. ToolJig and our examples are available at https://github.com/srp33/ToolJig. |
format | Online Article Text |
id | pubmed-8514239 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | eLife Sciences Publications, Ltd |
record_format | MEDLINE/PubMed |
spelling | pubmed-8514239 2021-10-15. eLife, Computational and Systems Biology. eLife Sciences Publications, Ltd, 2021-10-13. /pmc/articles/PMC8514239/ /pubmed/34643507 http://dx.doi.org/10.7554/eLife.71069. Text en. © 2021, Piccolo et al. This article is distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use and redistribution provided that the original author and source are credited. |
title | Simplifying the development of portable, scalable, and reproducible workflows |
topic | Computational and Systems Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8514239/ https://www.ncbi.nlm.nih.gov/pubmed/34643507 http://dx.doi.org/10.7554/eLife.71069 |
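
The description above mentions CWL tool documents and ‘input-object’ files. As a rough illustration only (not taken from the article or from ToolJig), the sketch below builds a minimal CWL `CommandLineTool` description and a matching input object as plain Python dictionaries and serializes them as JSON, which is valid YAML. The wrapped command (`wc -l`), the container image `ubuntu:22.04`, the names `input_file` and `line_count`, and the helper `missing_inputs` are all assumptions made for this example; the completeness check is far simpler than real CWL validation.

```python
import json


def make_tool():
    """Build a minimal CWL CommandLineTool description as a plain dict."""
    return {
        "cwlVersion": "v1.2",
        "class": "CommandLineTool",
        # Command to run; the input file is appended as the first argument.
        "baseCommand": ["wc", "-l"],
        # Hypothetical container image, chosen only for illustration.
        "hints": {"DockerRequirement": {"dockerPull": "ubuntu:22.04"}},
        "inputs": {
            "input_file": {"type": "File", "inputBinding": {"position": 1}},
        },
        # Capture standard output in a file and expose it as the tool output.
        "stdout": "line_count.txt",
        "outputs": {"line_count": {"type": "stdout"}},
    }


def make_input_object(path):
    """Build an input object: the values for one invocation of the tool."""
    return {"input_file": {"class": "File", "path": path}}


def missing_inputs(tool, input_object):
    """Return the names of declared inputs that have no value (hypothetical helper)."""
    return sorted(name for name in tool["inputs"] if name not in input_object)


if __name__ == "__main__":
    tool = make_tool()
    job = make_input_object("data.txt")  # hypothetical data file
    print(json.dumps(tool, indent=2))
    print(json.dumps(job, indent=2))
    print("missing inputs:", missing_inputs(tool, job))
```

Saved under hypothetical file names such as `tool.cwl` and `inputs.json`, the two documents could then be passed to a workflow engine, for example `cwltool tool.cwl inputs.json` with the CWL reference runner.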