metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences
Environmental shotgun sequencing (ESS) has potential to give greater insight into microbial communities than targeted sequencing of 16S regions, but requires much higher sequence coverage. The advent of next-generation sequencing has made it feasible for the Human Microbiome Project and other initia...
Main authors: | Ander, Christina; Schulz-Trieglaff, Ole B; Stoye, Jens; Cox, Anthony J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622627/ https://www.ncbi.nlm.nih.gov/pubmed/23734710 http://dx.doi.org/10.1186/1471-2105-14-S5-S2 |
_version_ | 1782265856445644800 |
---|---|
author | Ander, Christina Schulz-Trieglaff, Ole B Stoye, Jens Cox, Anthony J |
author_facet | Ander, Christina Schulz-Trieglaff, Ole B Stoye, Jens Cox, Anthony J |
author_sort | Ander, Christina |
collection | PubMed |
description | Environmental shotgun sequencing (ESS) has potential to give greater insight into microbial communities than targeted sequencing of 16S regions, but requires much higher sequence coverage. The advent of next-generation sequencing has made it feasible for the Human Microbiome Project and other initiatives to generate ESS data on a large scale, but computationally efficient methods for analysing such data sets are needed. Here we present metaBEETL, a fast taxonomic classifier for environmental shotgun sequences. It uses a Burrows-Wheeler Transform (BWT) index of the sequencing reads and an indexed database of microbial reference sequences. Unlike other BWT-based tools, our method has no upper limit on the number or the total size of the reference sequences in its database. By capturing sequence relationships between strains, our reference index also allows us to classify reads which are not unique to an individual strain but are nevertheless specific to some higher phylogenetic order. Tested on datasets with known taxonomic composition, metaBEETL gave results that are competitive with existing similarity-based tools: due to normalization steps which other classifiers lack, the taxonomic profile computed by metaBEETL closely matched the true environmental profile. At the same time, its moderate running time and low memory footprint allow metaBEETL to scale well to large data sets. Code to construct the BWT indexed database and for the taxonomic classification is part of the BEETL library, available as a github repository at git@github.com:BEETL/BEETL.git. |
format | Online Article Text |
id | pubmed-3622627 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-36226272013-04-15 metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences Ander, Christina Schulz-Trieglaff, Ole B Stoye, Jens Cox, Anthony J BMC Bioinformatics Proceedings Environmental shotgun sequencing (ESS) has potential to give greater insight into microbial communities than targeted sequencing of 16S regions, but requires much higher sequence coverage. The advent of next-generation sequencing has made it feasible for the Human Microbiome Project and other initiatives to generate ESS data on a large scale, but computationally efficient methods for analysing such data sets are needed. Here we present metaBEETL, a fast taxonomic classifier for environmental shotgun sequences. It uses a Burrows-Wheeler Transform (BWT) index of the sequencing reads and an indexed database of microbial reference sequences. Unlike other BWT-based tools, our method has no upper limit on the number or the total size of the reference sequences in its database. By capturing sequence relationships between strains, our reference index also allows us to classify reads which are not unique to an individual strain but are nevertheless specific to some higher phylogenetic order. Tested on datasets with known taxonomic composition, metaBEETL gave results that are competitive with existing similarity-based tools: due to normalization steps which other classifiers lack, the taxonomic profile computed by metaBEETL closely matched the true environmental profile. At the same time, its moderate running time and low memory footprint allow metaBEETL to scale well to large data sets. Code to construct the BWT indexed database and for the taxonomic classification is part of the BEETL library, available as a github repository at git@github.com:BEETL/BEETL.git. BioMed Central 2013-04-10 /pmc/articles/PMC3622627/ /pubmed/23734710 http://dx.doi.org/10.1186/1471-2105-14-S5-S2 Text en Copyright © 2013 Ander et al.; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Ander, Christina Schulz-Trieglaff, Ole B Stoye, Jens Cox, Anthony J metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences |
title | metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622627/ https://www.ncbi.nlm.nih.gov/pubmed/23734710 http://dx.doi.org/10.1186/1471-2105-14-S5-S2 |