MetaBinG2: a fast and accurate metagenomic sequence classification system for samples with many unknown organisms
Main authors: | , , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6104016/ https://www.ncbi.nlm.nih.gov/pubmed/30134953 http://dx.doi.org/10.1186/s13062-018-0220-y |
Summary: | BACKGROUND: Many methods have been developed for metagenomic sequence classification, and most of them depend heavily on genome sequences of the known organisms. A large portion of sequencing sequences may be classified as unknown, which greatly impairs our understanding of the whole sample. RESULT: Here we present MetaBinG2, a fast method for metagenomic sequence classification, especially for samples with a large number of unknown organisms. MetaBinG2 is based on sequence composition, and uses GPUs to accelerate its speed. A million 100 bp Illumina sequences can be classified in about 1 min on a computer with one GPU card. We evaluated MetaBinG2 by comparing it to multiple popular existing methods. We then applied MetaBinG2 to the dataset of MetaSUB Inter-City Challenge provided by CAMDA data analysis contest and compared community composition structures for environmental samples from different public places across cities. CONCLUSION: Compared to existing methods, MetaBinG2 is fast and accurate, especially for those samples with significant proportions of unknown organisms. REVIEWERS: This article was reviewed by Drs. Eran Elhaik, Nicolas Rascovan, and Serghei Mangul. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13062-018-0220-y) contains supplementary material, which is available to authorized users. |
---|