Next generation sequencing data of a defined microbial mock community

Generating sequence data of a defined community composed of organisms with complete reference genomes is indispensable for the benchmarking of new genome sequence analysis methods, including assembly and binning tools. Moreover the validation of new sequencing library protocols and platforms to asse...

Bibliographic Details
Main authors: Singer, Esther, Andreopoulos, Bill, Bowers, Robert M., Lee, Janey, Deshpande, Shweta, Chiniquy, Jennifer, Ciobanu, Doina, Klenk, Hans-Peter, Zane, Matthew, Daum, Christopher, Clum, Alicia, Cheng, Jan-Fang, Copeland, Alex, Woyke, Tanja
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5037974/
https://www.ncbi.nlm.nih.gov/pubmed/27673566
http://dx.doi.org/10.1038/sdata.2016.81
author Singer, Esther
Andreopoulos, Bill
Bowers, Robert M.
Lee, Janey
Deshpande, Shweta
Chiniquy, Jennifer
Ciobanu, Doina
Klenk, Hans-Peter
Zane, Matthew
Daum, Christopher
Clum, Alicia
Cheng, Jan-Fang
Copeland, Alex
Woyke, Tanja
author_sort Singer, Esther
collection PubMed
description Generating sequence data of a defined community composed of organisms with complete reference genomes is indispensable for the benchmarking of new genome sequence analysis methods, including assembly and binning tools. Moreover the validation of new sequencing library protocols and platforms to assess critical components such as sequencing errors and biases relies on such datasets. We here report the next generation metagenomic sequence data of a defined mock community (Mock Bacteria ARchaea Community; MBARC-26), composed of 23 bacterial and 3 archaeal strains with finished genomes. These strains span 10 phyla and 14 classes, a range of GC contents, genome sizes, repeat content and encompass a diverse abundance profile. Short read Illumina and long-read PacBio SMRT sequences of this mock community are described. These data represent a valuable resource for the scientific community, enabling extensive benchmarking and comparative evaluation of bioinformatics tools without the need to simulate data. As such, these data can aid in improving our current sequence data analysis toolkit and spur interest in the development of new tools.
format Online
Article
Text
id pubmed-5037974
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-50379742016-10-04 Next generation sequencing data of a defined microbial mock community Singer, Esther Andreopoulos, Bill Bowers, Robert M. Lee, Janey Deshpande, Shweta Chiniquy, Jennifer Ciobanu, Doina Klenk, Hans-Peter Zane, Matthew Daum, Christopher Clum, Alicia Cheng, Jan-Fang Copeland, Alex Woyke, Tanja Sci Data Data Descriptor Generating sequence data of a defined community composed of organisms with complete reference genomes is indispensable for the benchmarking of new genome sequence analysis methods, including assembly and binning tools. Moreover the validation of new sequencing library protocols and platforms to assess critical components such as sequencing errors and biases relies on such datasets. We here report the next generation metagenomic sequence data of a defined mock community (Mock Bacteria ARchaea Community; MBARC-26), composed of 23 bacterial and 3 archaeal strains with finished genomes. These strains span 10 phyla and 14 classes, a range of GC contents, genome sizes, repeat content and encompass a diverse abundance profile. Short read Illumina and long-read PacBio SMRT sequences of this mock community are described. These data represent a valuable resource for the scientific community, enabling extensive benchmarking and comparative evaluation of bioinformatics tools without the need to simulate data. As such, these data can aid in improving our current sequence data analysis toolkit and spur interest in the development of new tools. Nature Publishing Group 2016-09-27 /pmc/articles/PMC5037974/ /pubmed/27673566 http://dx.doi.org/10.1038/sdata.2016.81 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0 This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0 Metadata associated with this Data Descriptor is available at http://www.nature.com/sdata/ and is released under the CC0 waiver to maximize reuse.
title Next generation sequencing data of a defined microbial mock community
topic Data Descriptor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5037974/
https://www.ncbi.nlm.nih.gov/pubmed/27673566
http://dx.doi.org/10.1038/sdata.2016.81