
Watchdog 2.0: New developments for reusability, reproducibility, and workflow execution

BACKGROUND: Advances in high-throughput methods have brought new challenges for biological data analysis, often requiring many interdependent steps applied to a large number of samples. To address this challenge, workflow management systems, such as Watchdog, have been developed to support scientist...


Bibliographic Details
Main authors: Kluge, Michael; Friedl, Marie-Sophie; Menzel, Amrei L; Friedel, Caroline C
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7298769/
https://www.ncbi.nlm.nih.gov/pubmed/32556167
http://dx.doi.org/10.1093/gigascience/giaa068
_version_ 1783547267709403136
author Kluge, Michael
Friedl, Marie-Sophie
Menzel, Amrei L
Friedel, Caroline C
collection PubMed
description BACKGROUND: Advances in high-throughput methods have brought new challenges for biological data analysis, often requiring many interdependent steps applied to a large number of samples. To address this challenge, workflow management systems, such as Watchdog, have been developed to support scientists in the (semi-)automated execution of large analysis workflows. IMPLEMENTATION: Here, we present Watchdog 2.0, which implements new developments for module creation, reusability, and documentation and for reproducibility of analyses and workflow execution. Developments include a graphical user interface for semi-automatic module creation from software help pages, sharing repositories for modules and workflows, and a standardized module documentation format. The latter allows generation of a customized reference book of public and user-specific modules. Furthermore, extensive logging of workflow execution, module and software versions, and explicit support for package managers and container virtualization now ensures reproducibility of results. A step-by-step analysis protocol generated from the log file may, e.g., serve as a draft of a manuscript methods section. Finally, 2 new execution modes were implemented. One allows resuming workflow execution after interruption or modification without rerunning successfully executed tasks not affected by changes. The second one allows detaching and reattaching to workflow execution on a local computer while tasks continue running on computer clusters. CONCLUSIONS: Watchdog 2.0 provides several new developments that we believe to be of benefit for large-scale bioinformatics analysis and that are not completely covered by other competing workflow management systems. The software itself, module and workflow repositories, and comprehensive documentation are freely available at https://www.bio.ifi.lmu.de/watchdog.
format Online
Article
Text
id pubmed-7298769
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7298769 2020-06-22 Gigascience Technical Note Oxford University Press 2020-06-17 /pmc/articles/PMC7298769/ /pubmed/32556167 http://dx.doi.org/10.1093/gigascience/giaa068 Text en
© The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Watchdog 2.0: New developments for reusability, reproducibility, and workflow execution
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7298769/
https://www.ncbi.nlm.nih.gov/pubmed/32556167
http://dx.doi.org/10.1093/gigascience/giaa068