Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants
Whole-genome sequencing (WGS) is a fundamental technology for research to advance precision medicine, but the limited availability of portable and user-friendly workflows for WGS analyses poses a major challenge for many research groups and hampers scientific progress. Here we present Sarek, an open...
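Main authors: | Garcia, Maxime; Juhos, Szilveszter; Larsson, Malin; Olason, Pall I.; Martin, Marcel; Eisfeldt, Jesper; DiLorenzo, Sebastian; Sandgren, Johanna; Díaz De Ståhl, Teresita; Ewels, Philip; Wirta, Valtteri; Nistér, Monica; Käller, Max; Nystedt, Björn |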
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited 2020 |
Subjects: | Software Tool Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7111497/ https://www.ncbi.nlm.nih.gov/pubmed/32269765 http://dx.doi.org/10.12688/f1000research.16665.2 |
_version_ | 1783513301548793856 |
---|---|
author | Garcia, Maxime Juhos, Szilveszter Larsson, Malin Olason, Pall I. Martin, Marcel Eisfeldt, Jesper DiLorenzo, Sebastian Sandgren, Johanna Díaz De Ståhl, Teresita Ewels, Philip Wirta, Valtteri Nistér, Monica Käller, Max Nystedt, Björn |
author_facet | Garcia, Maxime Juhos, Szilveszter Larsson, Malin Olason, Pall I. Martin, Marcel Eisfeldt, Jesper DiLorenzo, Sebastian Sandgren, Johanna Díaz De Ståhl, Teresita Ewels, Philip Wirta, Valtteri Nistér, Monica Käller, Max Nystedt, Björn |
author_sort | Garcia, Maxime |
collection | PubMed |
description | Whole-genome sequencing (WGS) is a fundamental technology for research to advance precision medicine, but the limited availability of portable and user-friendly workflows for WGS analyses poses a major challenge for many research groups and hampers scientific progress. Here we present Sarek, an open-source workflow to detect germline variants and somatic mutations based on sequencing data from WGS, whole-exome sequencing (WES), or gene panels. Sarek features (i) easy installation, (ii) robust portability across different computer environments, (iii) comprehensive documentation, (iv) transparent and easy-to-read code, and (v) extensive quality metrics reporting. Sarek is implemented in the Nextflow workflow language and supports both Docker and Singularity containers as well as Conda environments, making it ideal for easy deployment on any POSIX-compatible computers and cloud compute environments. Sarek follows the GATK best-practice recommendations for read alignment and pre-processing, and includes a wide range of software for the identification and annotation of germline and somatic single-nucleotide variants, insertion and deletion variants, structural variants, tumour sample purity, and variations in ploidy and copy number. Sarek offers easy, efficient, and reproducible WGS analyses, and can readily be used both as a production workflow at sequencing facilities and as a powerful stand-alone tool for individual research groups. The Sarek source code, documentation and installation instructions are freely available at https://github.com/nf-core/sarek and at https://nf-co.re/sarek/. |
format | Online Article Text |
id | pubmed-7111497 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-71114972020-04-07 Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants Garcia, Maxime Juhos, Szilveszter Larsson, Malin Olason, Pall I. Martin, Marcel Eisfeldt, Jesper DiLorenzo, Sebastian Sandgren, Johanna Díaz De Ståhl, Teresita Ewels, Philip Wirta, Valtteri Nistér, Monica Käller, Max Nystedt, Björn F1000Res Software Tool Article Whole-genome sequencing (WGS) is a fundamental technology for research to advance precision medicine, but the limited availability of portable and user-friendly workflows for WGS analyses poses a major challenge for many research groups and hampers scientific progress. Here we present Sarek, an open-source workflow to detect germline variants and somatic mutations based on sequencing data from WGS, whole-exome sequencing (WES), or gene panels. Sarek features (i) easy installation, (ii) robust portability across different computer environments, (iii) comprehensive documentation, (iv) transparent and easy-to-read code, and (v) extensive quality metrics reporting. Sarek is implemented in the Nextflow workflow language and supports both Docker and Singularity containers as well as Conda environments, making it ideal for easy deployment on any POSIX-compatible computers and cloud compute environments. Sarek follows the GATK best-practice recommendations for read alignment and pre-processing, and includes a wide range of software for the identification and annotation of germline and somatic single-nucleotide variants, insertion and deletion variants, structural variants, tumour sample purity, and variations in ploidy and copy number. Sarek offers easy, efficient, and reproducible WGS analyses, and can readily be used both as a production workflow at sequencing facilities and as a powerful stand-alone tool for individual research groups. The Sarek source code, documentation and installation instructions are freely available at https://github.com/nf-core/sarek and at https://nf-co.re/sarek/. F1000 Research Limited 2020-09-04 /pmc/articles/PMC7111497/ /pubmed/32269765 http://dx.doi.org/10.12688/f1000research.16665.2 Text en Copyright: © 2020 Garcia M et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Tool Article Garcia, Maxime Juhos, Szilveszter Larsson, Malin Olason, Pall I. Martin, Marcel Eisfeldt, Jesper DiLorenzo, Sebastian Sandgren, Johanna Díaz De Ståhl, Teresita Ewels, Philip Wirta, Valtteri Nistér, Monica Käller, Max Nystedt, Björn Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants |
title | Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants |
title_full | Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants |
title_fullStr | Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants |
title_full_unstemmed | Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants |
title_short | Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants |
title_sort | sarek: a portable workflow for whole-genome sequencing analysis of germline and somatic variants |
topic | Software Tool Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7111497/ https://www.ncbi.nlm.nih.gov/pubmed/32269765 http://dx.doi.org/10.12688/f1000research.16665.2 |
work_keys_str_mv | AT garciamaxime sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT juhosszilveszter sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT larssonmalin sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT olasonpalli sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT martinmarcel sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT eisfeldtjesper sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT dilorenzosebastian sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT sandgrenjohanna sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT diazdestahlteresita sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT ewelsphilip sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT wirtavaltteri sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT nistermonica sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT kallermax sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants AT nystedtbjorn sarekaportableworkflowforwholegenomesequencinganalysisofgermlineandsomaticvariants |