Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants

Whole-genome sequencing (WGS) is a fundamental technology for research to advance precision medicine, but the limited availability of portable and user-friendly workflows for WGS analyses poses a major challenge for many research groups and hampers scientific progress. Here we present Sarek, an open...

Bibliographic Details
Main authors: Garcia, Maxime, Juhos, Szilveszter, Larsson, Malin, Olason, Pall I., Martin, Marcel, Eisfeldt, Jesper, DiLorenzo, Sebastian, Sandgren, Johanna, Díaz De Ståhl, Teresita, Ewels, Philip, Wirta, Valtteri, Nistér, Monica, Käller, Max, Nystedt, Björn
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7111497/
https://www.ncbi.nlm.nih.gov/pubmed/32269765
http://dx.doi.org/10.12688/f1000research.16665.2
collection PubMed
description Whole-genome sequencing (WGS) is a fundamental technology for research to advance precision medicine, but the limited availability of portable and user-friendly workflows for WGS analyses poses a major challenge for many research groups and hampers scientific progress. Here we present Sarek, an open-source workflow to detect germline variants and somatic mutations based on sequencing data from WGS, whole-exome sequencing (WES), or gene panels. Sarek features (i) easy installation, (ii) robust portability across different computer environments, (iii) comprehensive documentation, (iv) transparent and easy-to-read code, and (v) extensive quality metrics reporting. Sarek is implemented in the Nextflow workflow language and supports both Docker and Singularity containers as well as Conda environments, making it ideal for easy deployment on any POSIX-compatible computers and cloud compute environments. Sarek follows the GATK best-practice recommendations for read alignment and pre-processing, and includes a wide range of software for the identification and annotation of germline and somatic single-nucleotide variants, insertion and deletion variants, structural variants, tumour sample purity, and variations in ploidy and copy number. Sarek offers easy, efficient, and reproducible WGS analyses, and can readily be used both as a production workflow at sequencing facilities and as a powerful stand-alone tool for individual research groups. The Sarek source code, documentation and installation instructions are freely available at https://github.com/nf-core/sarek and at https://nf-co.re/sarek/.
format Online
Article
Text
id pubmed-7111497
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-7111497 2020-04-07 F1000Res Software Tool Article F1000 Research Limited 2020-09-04 /pmc/articles/PMC7111497/ /pubmed/32269765 http://dx.doi.org/10.12688/f1000research.16665.2 Text en Copyright: © 2020 Garcia M et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Software Tool Article