
SAMSA2: a standalone metatranscriptome analysis pipeline

BACKGROUND: Complex microbial communities are an area of growing interest in biology. Metatranscriptomics allows researchers to quantify microbial gene expression in an environmental sample via high-throughput sequencing. Metatranscriptomic experiments are computationally intensive because the exper...

Bibliographic Details
Main Authors: Westreich, Samuel T., Treiber, Michelle L., Mills, David A., Korf, Ian, Lemay, Danielle G.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5963165/
https://www.ncbi.nlm.nih.gov/pubmed/29783945
http://dx.doi.org/10.1186/s12859-018-2189-z
author Westreich, Samuel T.
Treiber, Michelle L.
Mills, David A.
Korf, Ian
Lemay, Danielle G.
collection PubMed
description BACKGROUND: Complex microbial communities are an area of growing interest in biology. Metatranscriptomics allows researchers to quantify microbial gene expression in an environmental sample via high-throughput sequencing. Metatranscriptomic experiments are computationally intensive because the experiments generate a large volume of sequence data and each sequence must be compared with reference sequences from thousands of organisms. RESULTS: SAMSA2 is an upgrade to the original Simple Annotation of Metatranscriptomes by Sequence Analysis (SAMSA) pipeline that has been redesigned for standalone use on a supercomputing cluster. SAMSA2 is faster due to the use of the DIAMOND aligner, and more flexible and reproducible because it uses local databases. SAMSA2 is available with detailed documentation, and example input and output files along with examples of master scripts for full pipeline execution. CONCLUSIONS: SAMSA2 is a rapid and efficient metatranscriptome pipeline for analyzing large RNA-seq datasets in a supercomputing cluster environment. SAMSA2 provides simplified output that can be examined directly or used for further analyses, and its reference databases may be upgraded, altered or customized to fit the needs of any experiment.
format Online
Article
Text
id pubmed-5963165
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-59631652018-05-24 SAMSA2: a standalone metatranscriptome analysis pipeline Westreich, Samuel T. Treiber, Michelle L. Mills, David A. Korf, Ian Lemay, Danielle G. BMC Bioinformatics Software BACKGROUND: Complex microbial communities are an area of growing interest in biology. Metatranscriptomics allows researchers to quantify microbial gene expression in an environmental sample via high-throughput sequencing. Metatranscriptomic experiments are computationally intensive because the experiments generate a large volume of sequence data and each sequence must be compared with reference sequences from thousands of organisms. RESULTS: SAMSA2 is an upgrade to the original Simple Annotation of Metatranscriptomes by Sequence Analysis (SAMSA) pipeline that has been redesigned for standalone use on a supercomputing cluster. SAMSA2 is faster due to the use of the DIAMOND aligner, and more flexible and reproducible because it uses local databases. SAMSA2 is available with detailed documentation, and example input and output files along with examples of master scripts for full pipeline execution. CONCLUSIONS: SAMSA2 is a rapid and efficient metatranscriptome pipeline for analyzing large RNA-seq datasets in a supercomputing cluster environment. SAMSA2 provides simplified output that can be examined directly or used for further analyses, and its reference databases may be upgraded, altered or customized to fit the needs of any experiment. BioMed Central 2018-05-21 /pmc/articles/PMC5963165/ /pubmed/29783945 http://dx.doi.org/10.1186/s12859-018-2189-z Text en © The Author(s). 
2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title SAMSA2: a standalone metatranscriptome analysis pipeline
topic Software