
Watchdog – a workflow management system for the distributed analysis of large-scale experimental data

BACKGROUND: The development of high-throughput experimental technologies, such as next-generation sequencing, has led to new challenges for handling, analyzing and integrating the resulting large and diverse datasets. Bioinformatical analysis of these data commonly requires a number of mutually dep...


Bibliographic Details
Main authors: Kluge, Michael, Friedel, Caroline C.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5850912/
https://www.ncbi.nlm.nih.gov/pubmed/29534677
http://dx.doi.org/10.1186/s12859-018-2107-4
author Kluge, Michael
Friedel, Caroline C.
collection PubMed
description BACKGROUND: The development of high-throughput experimental technologies, such as next-generation sequencing, has led to new challenges for handling, analyzing and integrating the resulting large and diverse datasets. Bioinformatical analysis of these data commonly requires a number of mutually dependent steps applied to numerous samples for multiple conditions and replicates. To support these analyses, a number of workflow management systems (WMSs) have been developed to allow automated execution of corresponding analysis workflows. Major advantages of WMSs are the easy reproducibility of results as well as the reusability of workflows or their components. RESULTS: In this article, we present Watchdog, a WMS for the automated analysis of large-scale experimental data. Main features include straightforward processing of replicate data, support for distributed computer systems, customizable error detection and manual intervention in workflow execution. Watchdog is implemented in Java, is thus platform-independent, and allows easy sharing of workflows and corresponding program modules. It provides a graphical user interface (GUI) for workflow construction using pre-defined modules as well as a helper script for creating new module definitions. Execution of workflows is possible using either the GUI or a command-line interface, and a web interface is provided for monitoring the execution status and intervening in case of errors. To illustrate its potential on a real-life example, a comprehensive workflow and modules for the analysis of RNA-seq experiments were implemented and are provided with the software in addition to simple test examples. CONCLUSIONS: Watchdog is a powerful and flexible WMS for the analysis of large-scale high-throughput experiments. We believe it will greatly benefit both users with and without programming skills who want to develop and apply bioinformatical workflows with reasonable overhead. The software, example workflows and comprehensive documentation are freely available at www.bio.ifi.lmu.de/watchdog. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2107-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5850912
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5850912 2018-03-21 Watchdog – a workflow management system for the distributed analysis of large-scale experimental data Kluge, Michael Friedel, Caroline C. BMC Bioinformatics Software BioMed Central 2018-03-13 /pmc/articles/PMC5850912/ /pubmed/29534677 http://dx.doi.org/10.1186/s12859-018-2107-4 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Watchdog – a workflow management system for the distributed analysis of large-scale experimental data
topic Software