
MetaBinG2: a fast and accurate metagenomic sequence classification system for samples with many unknown organisms

Bibliographic Details
Main Authors: Qiao, Yuyang; Jia, Ben; Hu, Zhiqiang; Sun, Chen; Xiang, Yijin; Wei, Chaochun
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6104016/
https://www.ncbi.nlm.nih.gov/pubmed/30134953
http://dx.doi.org/10.1186/s13062-018-0220-y
Description
Summary: BACKGROUND: Many methods have been developed for metagenomic sequence classification, and most of them depend heavily on the genome sequences of known organisms. A large portion of the sequenced reads may be classified as unknown, which greatly impairs our understanding of the whole sample. RESULT: Here we present MetaBinG2, a fast method for metagenomic sequence classification, especially for samples with a large number of unknown organisms. MetaBinG2 is based on sequence composition and uses GPUs for acceleration. A million 100 bp Illumina sequences can be classified in about 1 min on a computer with one GPU card. We evaluated MetaBinG2 by comparing it to multiple popular existing methods. We then applied MetaBinG2 to the MetaSUB Inter-City Challenge dataset provided by the CAMDA data analysis contest and compared community composition structures for environmental samples from different public places across cities. CONCLUSION: Compared to existing methods, MetaBinG2 is fast and accurate, especially for samples with significant proportions of unknown organisms. REVIEWERS: This article was reviewed by Drs. Eran Elhaik, Nicolas Rascovan, and Serghei Mangul. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13062-018-0220-y) contains supplementary material, which is available to authorized users.