GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts
SUMMARY: We present an updated version of our computational pipeline, PathSeq, for the discovery and identification of microbial sequences in genomic and transcriptomic libraries from eukaryotic hosts. This pipeline is available in the Genome Analysis Toolkit (GATK) as a suite of configurable tools...
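Main authors: | Walker, Mark A; Pedamallu, Chandra Sekhar; Ojesina, Akinyemi I; Bullman, Susan; Sharpe, Ted; Whelan, Christopher W; Meyerson, Matthew |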
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2018 |
Subjects: | Applications Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289130/ https://www.ncbi.nlm.nih.gov/pubmed/29982281 http://dx.doi.org/10.1093/bioinformatics/bty501 |
_version_ | 1783379928521113600 |
---|---|
author | Walker, Mark A; Pedamallu, Chandra Sekhar; Ojesina, Akinyemi I; Bullman, Susan; Sharpe, Ted; Whelan, Christopher W; Meyerson, Matthew |
author_facet | Walker, Mark A; Pedamallu, Chandra Sekhar; Ojesina, Akinyemi I; Bullman, Susan; Sharpe, Ted; Whelan, Christopher W; Meyerson, Matthew |
author_sort | Walker, Mark A |
collection | PubMed |
description | SUMMARY: We present an updated version of our computational pipeline, PathSeq, for the discovery and identification of microbial sequences in genomic and transcriptomic libraries from eukaryotic hosts. This pipeline is available in the Genome Analysis Toolkit (GATK) as a suite of configurable tools that can report the microbial composition of DNA or RNA short-read sequencing samples and identify unknown sequences for downstream assembly of novel organisms. GATK PathSeq enables sample analysis in minutes at low cost. In addition, these tools are built with the GATK engine and Apache Spark framework, providing robust, rapid parallelization of read quality filtering, host subtraction and microbial alignment in workstation, cluster and cloud environments. AVAILABILITY AND IMPLEMENTATION: These tools are available as a part of the GATK at https://github.com/broadinstitute/gatk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-6289130 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-62891302018-12-14 GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts Walker, Mark A Pedamallu, Chandra Sekhar Ojesina, Akinyemi I Bullman, Susan Sharpe, Ted Whelan, Christopher W Meyerson, Matthew Bioinformatics Applications Notes SUMMARY: We present an updated version of our computational pipeline, PathSeq, for the discovery and identification of microbial sequences in genomic and transcriptomic libraries from eukaryotic hosts. This pipeline is available in the Genome Analysis Toolkit (GATK) as a suite of configurable tools that can report the microbial composition of DNA or RNA short-read sequencing samples and identify unknown sequences for downstream assembly of novel organisms. GATK PathSeq enables sample analysis in minutes at low cost. In addition, these tools are built with the GATK engine and Apache Spark framework, providing robust, rapid parallelization of read quality filtering, host subtraction and microbial alignment in workstation, cluster and cloud environments. AVAILABILITY AND IMPLEMENTATION: These tools are available as a part of the GATK at https://github.com/broadinstitute/gatk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2018-12-15 2018-07-04 /pmc/articles/PMC6289130/ /pubmed/29982281 http://dx.doi.org/10.1093/bioinformatics/bty501 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Notes Walker, Mark A Pedamallu, Chandra Sekhar Ojesina, Akinyemi I Bullman, Susan Sharpe, Ted Whelan, Christopher W Meyerson, Matthew GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts |
title | GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289130/ https://www.ncbi.nlm.nih.gov/pubmed/29982281 http://dx.doi.org/10.1093/bioinformatics/bty501 |