SUSHI: an exquisite recipe for fully documented, reproducible and reusable NGS data analysis

BACKGROUND: Next generation sequencing (NGS) produces massive datasets consisting of billions of reads and up to thousands of samples. Subsequent bioinformatic analysis is typically done with the help of open source tools, where each application performs a single step towards the final result. This...

Bibliographic Details
Main Authors:	Hatakeyama, Masaomi, Opitz, Lennart, Russo, Giancarlo, Qi, Weihong, Schlapbach, Ralph, Rehrauer, Hubert
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Software
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4890512/ https://www.ncbi.nlm.nih.gov/pubmed/27255077 http://dx.doi.org/10.1186/s12859-016-1104-8

author	Hatakeyama, Masaomi Opitz, Lennart Russo, Giancarlo Qi, Weihong Schlapbach, Ralph Rehrauer, Hubert
author_sort	Hatakeyama, Masaomi
collection	PubMed
description	BACKGROUND: Next generation sequencing (NGS) produces massive datasets consisting of billions of reads and up to thousands of samples. Subsequent bioinformatic analysis is typically done with the help of open source tools, where each application performs a single step towards the final result. This situation leaves the bioinformaticians with the tasks to combine the tools, manage the data files and meta-information, document the analysis, and ensure reproducibility. RESULTS: We present SUSHI, an agile data analysis framework that relieves bioinformaticians from the administrative challenges of their data analysis. SUSHI lets users build reproducible data analysis workflows from individual applications and manages the input data, the parameters, meta-information with user-driven semantics, and the job scripts. As distinguishing features, SUSHI provides an expert command line interface as well as a convenient web interface to run bioinformatics tools. SUSHI datasets are self-contained and self-documented on the file system. This makes them fully reproducible and ready to be shared. With the associated meta-information being formatted as plain text tables, the datasets can be readily further analyzed and interpreted outside SUSHI. CONCLUSION: SUSHI provides an exquisite recipe for analysing NGS data. By following the SUSHI recipe, SUSHI makes data analysis straightforward and takes care of documentation and administration tasks. Thus, the user can fully dedicate his time to the analysis itself. SUSHI is suitable for use by bioinformaticians as well as life science researchers. It is targeted for, but by no means constrained to, NGS data analysis. Our SUSHI instance is in productive use and has served as data analysis interface for more than 1000 data analysis projects. SUSHI source code as well as a demo server are freely available. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1104-8) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-4890512
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4890512 2016-06-10 SUSHI: an exquisite recipe for fully documented, reproducible and reusable NGS data analysis Hatakeyama, Masaomi Opitz, Lennart Russo, Giancarlo Qi, Weihong Schlapbach, Ralph Rehrauer, Hubert BMC Bioinformatics Software BioMed Central 2016-06-02 /pmc/articles/PMC4890512/ /pubmed/27255077 http://dx.doi.org/10.1186/s12859-016-1104-8 Text en © Hatakeyama et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	SUSHI: an exquisite recipe for fully documented, reproducible and reusable NGS data analysis
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4890512/ https://www.ncbi.nlm.nih.gov/pubmed/27255077 http://dx.doi.org/10.1186/s12859-016-1104-8
