Natrix: a Snakemake-based workflow for processing, clustering, and taxonomically assigning amplicon sequencing reads

Bibliographic Details
Main Authors:	Welzel, Marius; Lange, Anja; Heider, Dominik; Schwarz, Michael; Freisleben, Bernd; Jensen, Manfred; Boenigk, Jens; Beisser, Daniela
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Software
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7667751/ https://www.ncbi.nlm.nih.gov/pubmed/33198651 http://dx.doi.org/10.1186/s12859-020-03852-4

author	Welzel, Marius; Lange, Anja; Heider, Dominik; Schwarz, Michael; Freisleben, Bernd; Jensen, Manfred; Boenigk, Jens; Beisser, Daniela
collection	PubMed
description	BACKGROUND: Sequencing of marker genes amplified from environmental samples, known as amplicon sequencing, allows us to resolve some of the hidden diversity and elucidate evolutionary relationships and ecological processes among complex microbial communities. The analysis of large numbers of samples at high sequencing depths generated by high-throughput sequencing technologies requires efficient, flexible, and reproducible bioinformatics pipelines. Only a few existing workflows can be run in a user-friendly, scalable, and reproducible manner on different computing devices using an efficient workflow management system. RESULTS: We present Natrix, an open-source bioinformatics workflow for preprocessing raw amplicon sequencing data. The workflow covers all analysis steps: quality assessment, read assembly, dereplication, chimera detection, split-sample merging, sequence representative assignment (OTUs or ASVs), and the taxonomic assignment of sequence representatives. The workflow is written using Snakemake, a workflow management engine for developing data analysis workflows, and uses Conda for version control. Thus, Snakemake ensures reproducibility and Conda offers version control of the utilized programs. The encapsulation of rules and their dependencies supports hassle-free sharing of rules between workflows and easy adaptation and extension of existing workflows. Natrix is freely available on GitHub (https://github.com/MW55/Natrix) or as a Docker container on DockerHub (https://hub.docker.com/r/mw55/natrix). CONCLUSION: Natrix is a user-friendly and highly extensible workflow for processing Illumina amplicon data.
format	Online Article Text
id	pubmed-7667751
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-7667751 2020-11-17 Natrix: a Snakemake-based workflow for processing, clustering, and taxonomically assigning amplicon sequencing reads Welzel, Marius; Lange, Anja; Heider, Dominik; Schwarz, Michael; Freisleben, Bernd; Jensen, Manfred; Boenigk, Jens; Beisser, Daniela BMC Bioinformatics Software
BioMed Central 2020-11-16 /pmc/articles/PMC7667751/ /pubmed/33198651 http://dx.doi.org/10.1186/s12859-020-03852-4 Text en © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Natrix: a Snakemake-based workflow for processing, clustering, and taxonomically assigning amplicon sequencing reads
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7667751/ https://www.ncbi.nlm.nih.gov/pubmed/33198651 http://dx.doi.org/10.1186/s12859-020-03852-4
