Xander: employing a novel method for efficient gene-targeted metagenomic assembly

BACKGROUND: Metagenomics can provide important insight into microbial communities. However, assembling metagenomic datasets has proven to be computationally challenging. Current methods often assemble only fragmented partial genes. RESULTS: We present a novel method for targeting assembly of specifi...

Bibliographic Details
Main authors: Wang, Qiong, Fish, Jordan A., Gilman, Mariah, Sun, Yanni, Brown, C. Titus, Tiedje, James M., Cole, James R.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4526283/
https://www.ncbi.nlm.nih.gov/pubmed/26246894
http://dx.doi.org/10.1186/s40168-015-0093-6
author Wang, Qiong
Fish, Jordan A.
Gilman, Mariah
Sun, Yanni
Brown, C. Titus
Tiedje, James M.
Cole, James R.
collection PubMed
description BACKGROUND: Metagenomics can provide important insight into microbial communities. However, assembling metagenomic datasets has proven to be computationally challenging. Current methods often assemble only fragmented partial genes. RESULTS: We present a novel method for targeting assembly of specific protein-coding genes. This method combines a de Bruijn graph, as used in standard assembly approaches, and a protein profile hidden Markov model (HMM) for the gene of interest, as used in standard annotation approaches. These are used to create a novel combined weighted assembly graph. Xander performs both assembly and annotation concomitantly using information incorporated in this graph. We demonstrate the utility of this approach by assembling contigs for one phylogenetic marker gene and for two functional marker genes, first on Human Microbiome Project (HMP)-defined community Illumina data and then on 21 rhizosphere soil metagenomic datasets from three different crops totaling over 800 Gbp of unassembled data. We compared our method to a recently published bulk metagenome assembly method and a recently published gene-targeted assembler and found our method produced more, longer, and higher quality gene sequences. CONCLUSION: Xander combines gene assignment with the rapid assembly of full-length or near full-length functional genes from metagenomic data without requiring bulk assembly or post-processing to find genes of interest. HMMs used for assembly can be tailored to the targeted genes, allowing flexibility to improve annotation over generic annotation pipelines. This method is implemented as open source software and is available at https://github.com/rdpstaff/Xander_assembler. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40168-015-0093-6) contains supplementary material, which is available to authorized users.
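The combined graph described in the abstract can be illustrated with a toy sketch. This is not Xander's implementation: the function names (`build_kmer_set`, `best_gene_path`), the seed k-mer, and the `mismatch` penalty are hypothetical, and the "profile" here is only a list of per-position amino acid scores (match states only), assumed as a stand-in for a full protein profile HMM with insert and delete states. The sketch walks an implicit de Bruijn graph one codon at a time and keeps the path whose translation scores best against the profile.

```python
import math
from functools import lru_cache
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def build_kmer_set(reads, k):
    """Collect every k-mer seen in the reads (nodes of an implicit de Bruijn graph)."""
    return {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}


def best_gene_path(reads, k, seed, profile, mismatch=-2.0):
    """Extend `seed` codon by codon, scoring each codon against one profile column.

    `profile` is a list of dicts mapping amino acid -> score (hypothetical,
    match-state-only simplification). Returns (score, dna) or None.
    """
    kmers = build_kmer_set(reads, k)

    @lru_cache(maxsize=None)
    def best(kmer, pos):
        # A combined-graph node is the pair (k-mer, profile position).
        if pos == len(profile):
            return 0.0, ""
        top = (-math.inf, "")
        for codon, aa in CODON_TABLE.items():
            if aa == "*":
                continue  # do not extend through stop codons
            window = kmer + codon
            # Each of the three k-mers created by the codon must occur in the reads.
            if not all(window[j:j + k] in kmers for j in range(1, 4)):
                continue
            tail_score, tail = best(window[3:], pos + 1)
            total = profile[pos].get(aa, mismatch) + tail_score
            if total > top[0]:
                top = (total, codon + tail)
        return top

    if seed not in kmers:
        return None
    score, dna = best(seed, 0)
    return None if score == -math.inf else (score, dna)


# Two reads share a prefix and then branch; the profile prefers M-K-G.
reads = ["CCGGATATGAAAGGT", "CCGGATATGCCCTTT"]
profile = [{"M": 2.0}, {"K": 1.5}, {"G": 1.0}]
result = best_gene_path(reads, k=6, seed="CCGGAT", profile=profile)
print(result)
```

Because the profile position advances with every codon, the search cannot loop even when the k-mer graph contains cycles, which is one reason pairing graph nodes with model positions is convenient in a sketch like this.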
format Online
Article
Text
id pubmed-4526283
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4526283 2015-08-06 Xander: employing a novel method for efficient gene-targeted metagenomic assembly Wang, Qiong; Fish, Jordan A.; Gilman, Mariah; Sun, Yanni; Brown, C. Titus; Tiedje, James M.; Cole, James R. Microbiome Software BioMed Central 2015-08-05 /pmc/articles/PMC4526283/ /pubmed/26246894 http://dx.doi.org/10.1186/s40168-015-0093-6 Text en © Wang et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Xander: employing a novel method for efficient gene-targeted metagenomic assembly
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4526283/
https://www.ncbi.nlm.nih.gov/pubmed/26246894
http://dx.doi.org/10.1186/s40168-015-0093-6