Introducing EzBioCloud: a taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies
The recent advent of DNA sequencing technologies facilitates the use of genome sequencing data that provide means for more informative and precise classification and identification of members of the Bacteria and Archaea. Because the current species definition is based on the comparison of genome seq...
Main Authors: | Yoon, Seok-Hwan; Ha, Sung-Min; Kwon, Soonjae; Lim, Jeongmin; Kim, Yeseul; Seo, Hyungseok; Chun, Jongsik |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Microbiology Society 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5563544/ https://www.ncbi.nlm.nih.gov/pubmed/28005526 http://dx.doi.org/10.1099/ijsem.0.001755 |
_version_ | 1783258146797518848 |
---|---|
author | Yoon, Seok-Hwan Ha, Sung-Min Kwon, Soonjae Lim, Jeongmin Kim, Yeseul Seo, Hyungseok Chun, Jongsik |
author_sort | Yoon, Seok-Hwan |
collection | PubMed |
description | The recent advent of DNA sequencing technologies facilitates the use of genome sequencing data that provide means for more informative and precise classification and identification of members of the Bacteria and Archaea. Because the current species definition is based on the comparison of genome sequences between type and other strains in a given species, building a genome database with correct taxonomic information is of paramount need to enhance our efforts in exploring prokaryotic diversity and discovering novel species as well as for routine identifications. Here we introduce an integrated database, called EzBioCloud, that holds the taxonomic hierarchy of the Bacteria and Archaea, which is represented by quality-controlled 16S rRNA gene and genome sequences. Whole-genome assemblies in the NCBI Assembly Database were screened for low quality and subjected to a composite identification bioinformatics pipeline that employs gene-based searches followed by the calculation of average nucleotide identity. As a result, the database is made of 61 700 species/phylotypes, including 13 132 with validly published names, and 62 362 whole-genome assemblies that were identified taxonomically at the genus, species and subspecies levels. Genomic properties, such as genome size and DNA G+C content, and the occurrence in human microbiome data were calculated for each genus or higher taxa. This united database of taxonomy, 16S rRNA gene and genome sequences, with accompanying bioinformatics tools, should accelerate genome-based classification and identification of members of the Bacteria and Archaea. The database and related search tools are available at www.ezbiocloud.net/. |
format | Online Article Text |
id | pubmed-5563544 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Microbiology Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-5563544 2018-02-19 Introducing EzBioCloud: a taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies Yoon, Seok-Hwan Ha, Sung-Min Kwon, Soonjae Lim, Jeongmin Kim, Yeseul Seo, Hyungseok Chun, Jongsik Int J Syst Evol Microbiol Research Article Microbiology Society 2017-05 2017-05-30 /pmc/articles/PMC5563544/ /pubmed/28005526 http://dx.doi.org/10.1099/ijsem.0.001755 Text en © 2017 IUMS http://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited. |
title | Introducing EzBioCloud: a taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5563544/ https://www.ncbi.nlm.nih.gov/pubmed/28005526 http://dx.doi.org/10.1099/ijsem.0.001755 |
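
The abstract notes that genomic properties such as genome size and DNA G+C content were calculated from the assemblies. As a rough illustration of what those two properties are, the following minimal sketch computes them from a list of contig sequences. The function names (`gc_content`, `assembly_properties`) and the choice to ignore ambiguous bases such as `N` when computing G+C are assumptions made for this example; the record does not describe how EzBioCloud implements the calculation.

```python
from typing import Dict, Iterable


def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Assumption for this sketch: only unambiguous bases (A, C, G, T)
    are counted, so ambiguity codes such as N do not affect the result.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def assembly_properties(contigs: Iterable[str]) -> Dict[str, float]:
    """Summarise a (hypothetical) assembly given as contig strings."""
    contig_list = list(contigs)
    # Genome size here is simply the total length of all contigs.
    genome_size = sum(len(c) for c in contig_list)
    # G+C content is computed over the concatenated contigs.
    gc_percent = gc_content("".join(contig_list))
    return {"genome_size_bp": genome_size, "gc_percent": gc_percent}


if __name__ == "__main__":
    # Toy contigs, invented for illustration only.
    props = assembly_properties(["ATGCGC", "GGCCAATT", "NNAT"])
    print(props)
```

Here the genome size counts every position, including ambiguous ones, while the G+C percentage is taken over unambiguous bases only; a real pipeline may define either quantity differently.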