metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences

Environmental shotgun sequencing (ESS) has potential to give greater insight into microbial communities than targeted sequencing of 16S regions, but requires much higher sequence coverage. The advent of next-generation sequencing has made it feasible for the Human Microbiome Project and other initia...

Bibliographic Details
Main authors: Ander, Christina; Schulz-Trieglaff, Ole B; Stoye, Jens; Cox, Anthony J
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622627/
https://www.ncbi.nlm.nih.gov/pubmed/23734710
http://dx.doi.org/10.1186/1471-2105-14-S5-S2
_version_ 1782265856445644800
author Ander, Christina
Schulz-Trieglaff, Ole B
Stoye, Jens
Cox, Anthony J
author_facet Ander, Christina
Schulz-Trieglaff, Ole B
Stoye, Jens
Cox, Anthony J
author_sort Ander, Christina
collection PubMed
description Environmental shotgun sequencing (ESS) has potential to give greater insight into microbial communities than targeted sequencing of 16S regions, but requires much higher sequence coverage. The advent of next-generation sequencing has made it feasible for the Human Microbiome Project and other initiatives to generate ESS data on a large scale, but computationally efficient methods for analysing such data sets are needed. Here we present metaBEETL, a fast taxonomic classifier for environmental shotgun sequences. It uses a Burrows-Wheeler Transform (BWT) index of the sequencing reads and an indexed database of microbial reference sequences. Unlike other BWT-based tools, our method has no upper limit on the number or the total size of the reference sequences in its database. By capturing sequence relationships between strains, our reference index also allows us to classify reads which are not unique to an individual strain but are nevertheless specific to some higher phylogenetic order. Tested on datasets with known taxonomic composition, metaBEETL gave results that are competitive with existing similarity-based tools: due to normalization steps which other classifiers lack, the taxonomic profile computed by metaBEETL closely matched the true environmental profile. At the same time, its moderate running time and low memory footprint allow metaBEETL to scale well to large data sets. Code to construct the BWT indexed database and for the taxonomic classification is part of the BEETL library, available as a github repository at git@github.com:BEETL/BEETL.git.
format Online
Article
Text
id pubmed-3622627
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-36226272013-04-15 metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences Ander, Christina Schulz-Trieglaff, Ole B Stoye, Jens Cox, Anthony J BMC Bioinformatics Proceedings Environmental shotgun sequencing (ESS) has potential to give greater insight into microbial communities than targeted sequencing of 16S regions, but requires much higher sequence coverage. The advent of next-generation sequencing has made it feasible for the Human Microbiome Project and other initiatives to generate ESS data on a large scale, but computationally efficient methods for analysing such data sets are needed. Here we present metaBEETL, a fast taxonomic classifier for environmental shotgun sequences. It uses a Burrows-Wheeler Transform (BWT) index of the sequencing reads and an indexed database of microbial reference sequences. Unlike other BWT-based tools, our method has no upper limit on the number or the total size of the reference sequences in its database. By capturing sequence relationships between strains, our reference index also allows us to classify reads which are not unique to an individual strain but are nevertheless specific to some higher phylogenetic order. Tested on datasets with known taxonomic composition, metaBEETL gave results that are competitive with existing similarity-based tools: due to normalization steps which other classifiers lack, the taxonomic profile computed by metaBEETL closely matched the true environmental profile. At the same time, its moderate running time and low memory footprint allow metaBEETL to scale well to large data sets. Code to construct the BWT indexed database and for the taxonomic classification is part of the BEETL library, available as a github repository at git@github.com:BEETL/BEETL.git. BioMed Central 2013-04-10 /pmc/articles/PMC3622627/ /pubmed/23734710 http://dx.doi.org/10.1186/1471-2105-14-S5-S2 Text en Copyright © 2013 Ander et al.; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Ander, Christina
Schulz-Trieglaff, Ole B
Stoye, Jens
Cox, Anthony J
metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences
title metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences
title_full metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences
title_fullStr metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences
title_full_unstemmed metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences
title_short metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences
title_sort metabeetl: high-throughput analysis of heterogeneous microbial populations from shotgun dna sequences
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622627/
https://www.ncbi.nlm.nih.gov/pubmed/23734710
http://dx.doi.org/10.1186/1471-2105-14-S5-S2