Shotgun metagenome data of a defined mock community using Oxford Nanopore, PacBio and Illumina technologies

Bibliographic Details
Main Authors: Sevim, Volkan; Lee, Juna; Egan, Robert; Clum, Alicia; Hundley, Hope; Lee, Janey; Everroad, R. Craig; Detweiler, Angela M.; Bebout, Brad M.; Pett-Ridge, Jennifer; Göker, Markus; Murray, Alison E.; Lindemann, Stephen R.; Klenk, Hans-Peter; O’Malley, Ronan; Zane, Matthew; Cheng, Jan-Fang; Copeland, Alex; Daum, Christopher; Singer, Esther; Woyke, Tanja
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6879543/
https://www.ncbi.nlm.nih.gov/pubmed/31772173
http://dx.doi.org/10.1038/s41597-019-0287-z
Description
Summary: Metagenomic sequence data from defined mock communities is crucial for the assessment of sequencing platform performance and downstream analyses, including assembly, binning and taxonomic assignment. We report a comparison of shotgun metagenome sequencing and assembly metrics of a defined microbial mock community using the Oxford Nanopore Technologies (ONT) MinION, PacBio and Illumina sequencing platforms. Our synthetic microbial community BMock12 consists of 12 bacterial strains with genome sizes spanning 3.2–7.2 Mbp, 40–73% GC content, and 1.5–7.3% repeats. Size selection of both PacBio and ONT sequencing libraries prior to sequencing was essential to yield comparable relative abundances of organisms among all sequencing technologies. While the Illumina-based metagenome assembly yielded good coverage with few misassemblies, both the Illumina + ONT and Illumina + PacBio hybrid assemblies greatly improved contiguity but also increased misassemblies, most notably in genomes with high sequence similarity to each other. Our resulting datasets allow evaluation and benchmarking of bioinformatics software on Illumina, PacBio and ONT platforms in parallel.