
GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts

SUMMARY: We present an updated version of our computational pipeline, PathSeq, for the discovery and identification of microbial sequences in genomic and transcriptomic libraries from eukaryotic hosts. This pipeline is available in the Genome Analysis Toolkit (GATK) as a suite of configurable tools...

Bibliographic Details
Main authors: Walker, Mark A; Pedamallu, Chandra Sekhar; Ojesina, Akinyemi I; Bullman, Susan; Sharpe, Ted; Whelan, Christopher W; Meyerson, Matthew
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects: Applications Notes
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289130/
https://www.ncbi.nlm.nih.gov/pubmed/29982281
http://dx.doi.org/10.1093/bioinformatics/bty501
author Walker, Mark A
Pedamallu, Chandra Sekhar
Ojesina, Akinyemi I
Bullman, Susan
Sharpe, Ted
Whelan, Christopher W
Meyerson, Matthew
collection PubMed
description SUMMARY: We present an updated version of our computational pipeline, PathSeq, for the discovery and identification of microbial sequences in genomic and transcriptomic libraries from eukaryotic hosts. This pipeline is available in the Genome Analysis Toolkit (GATK) as a suite of configurable tools that can report the microbial composition of DNA or RNA short-read sequencing samples and identify unknown sequences for downstream assembly of novel organisms. GATK PathSeq enables sample analysis in minutes at low cost. In addition, these tools are built with the GATK engine and Apache Spark framework, providing robust, rapid parallelization of read quality filtering, host subtraction and microbial alignment in workstation, cluster and cloud environments. AVAILABILITY AND IMPLEMENTATION: These tools are available as a part of the GATK at https://github.com/broadinstitute/gatk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6289130
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6289130 2018-12-14 Bioinformatics Applications Notes Oxford University Press 2018-12-15 2018-07-04 /pmc/articles/PMC6289130/ /pubmed/29982281 http://dx.doi.org/10.1093/bioinformatics/bty501 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts
topic Applications Notes