Script of Scripts: A pragmatic workflow system for daily computational research

Computationally intensive disciplines such as computational biology often require use of a variety of tools implemented in different scripting languages and analysis of large data sets using high-performance computing systems. Although scientific workflow systems can powerfully organize and execute...

Bibliographic Details
Main authors: Wang, Gao; Peng, Bo
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6411228/
https://www.ncbi.nlm.nih.gov/pubmed/30811390
http://dx.doi.org/10.1371/journal.pcbi.1006843
author Wang, Gao
Peng, Bo
collection PubMed
description Computationally intensive disciplines such as computational biology often require use of a variety of tools implemented in different scripting languages and analysis of large data sets using high-performance computing systems. Although scientific workflow systems can powerfully organize and execute large-scale data-analysis processes, creating and maintaining such workflows usually comes with nontrivial learning curves and engineering overhead, making them cumbersome to use for everyday data exploration and prototyping. To bridge the gap between interactive analysis and workflow systems, we developed Script of Scripts (SoS), an interactive data-analysis platform and workflow system with a strong emphasis on readability, practicality, and reproducibility in daily computational research. For exploratory analysis, SoS has a multilanguage scripting format that centralizes otherwise-scattered scripts and creates dynamic reports for publication and sharing. As a workflow engine, SoS provides an intuitive syntax for creating workflows in process-oriented, outcome-oriented, and mixed styles, as well as a unified interface for executing and managing tasks on a variety of computing platforms with automatic synchronization of files among isolated file systems. As illustrated herein by real-world examples, SoS is both an interactive analysis tool and pipeline platform suitable for different stages of method development and data-analysis projects. In particular, SoS can be easily adopted in existing data analysis routines to substantially improve organization, readability, and cross-platform computation management of research projects.
format Online
Article
Text
id pubmed-6411228
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6411228 2019-04-02 Script of Scripts: A pragmatic workflow system for daily computational research Wang, Gao Peng, Bo PLoS Comput Biol Research Article
Public Library of Science 2019-02-27 /pmc/articles/PMC6411228/ /pubmed/30811390 http://dx.doi.org/10.1371/journal.pcbi.1006843 Text en © 2019 Wang, Peng http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Script of Scripts: A pragmatic workflow system for daily computational research
topic Research Article