Meta-IDBA: a de Novo assembler for metagenomic data

Motivation: Next-generation sequencing techniques allow us to generate reads from a microbial environment in order to analyze the microbial community. However, assembling a set of mixed reads from different species into contigs is a bottleneck in metagenomic research. Although there are many a...

Bibliographic Details
Main authors:	Peng, Yu; Leung, Henry C. M.; Yiu, S. M.; Chin, Francis Y. L.
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2011
Subjects:	ISMB/ECCB 2011 Proceedings Papers Committee, July 17 to July 19, 2011, Vienna, Austria
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117360/ https://www.ncbi.nlm.nih.gov/pubmed/21685107 http://dx.doi.org/10.1093/bioinformatics/btr216

_version_	1782206322337382400
author	Peng, Yu; Leung, Henry C. M.; Yiu, S. M.; Chin, Francis Y. L.
collection	PubMed
description	Motivation: Next-generation sequencing techniques allow us to generate reads from a microbial environment in order to analyze the microbial community. However, assembling a set of mixed reads from different species into contigs is a bottleneck in metagenomic research. Although there are many assemblers for reads from a single genome, there are no assemblers for reads in metagenomic data without reference genome sequences. Moreover, the performance of these single-genome assemblers on metagenomic data is far from satisfactory because common regions in the genomes of subspecies and species make the assembly problem much more complicated. Results: We introduce the Meta-IDBA algorithm for assembling reads in metagenomic data, which contain multiple genomes from different species. There are two core steps in Meta-IDBA. It first tries to partition the de Bruijn graph into isolated components of different species based on an important observation. Then, for each component, it captures the slight variants of the genomes of subspecies from the same species by multiple alignments and represents the genome of one species using a consensus sequence. A comparison of Meta-IDBA with existing assemblers, such as Velvet and ABySS, on different metagenomic datasets shows that Meta-IDBA can reconstruct longer contigs with similar accuracy. Availability: Meta-IDBA toolkit is available at our website http://www.cs.hku.hk/~alse/metaidba. Contact: chin@cs.hku.hk
format	Online Article Text
id	pubmed-3117360
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-3117360 2011-06-17 Bioinformatics Oxford University Press 2011-07-01 2011-06-14 /pmc/articles/PMC3117360/ /pubmed/21685107 http://dx.doi.org/10.1093/bioinformatics/btr216 Text en © The Author(s) 2011. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Meta-IDBA: a de Novo assembler for metagenomic data
topic	ISMB/ECCB 2011 Proceedings Papers Committee, July 17 to July 19, 2011, Vienna, Austria
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117360/ https://www.ncbi.nlm.nih.gov/pubmed/21685107 http://dx.doi.org/10.1093/bioinformatics/btr216