dBBQs: dataBase of Bacterial Quality scores

Bibliographic Details
Main authors: Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751761/
https://www.ncbi.nlm.nih.gov/pubmed/29297289
http://dx.doi.org/10.1186/s12859-017-1900-9
_version_ 1783290012190638080
author Wanchai, Visanu
Patumcharoenpol, Preecha
Nookaew, Intawat
Ussery, David
author_facet Wanchai, Visanu
Patumcharoenpol, Preecha
Nookaew, Intawat
Ussery, David
author_sort Wanchai, Visanu
collection PubMed
description BACKGROUND: It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of genome sequences available in public databases are not complete but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. RESULTS: Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The quality score for each genome was calculated from four measurements: assembly quality, the number of rRNA genes, the number of tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast search and download features, and the results can be used for further analysis. In addition, the search results are shown as interactive charts built with the DC.js JavaScript charting framework. The analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. CONCLUSIONS: dBBQs (available at http://arc-gem.uams.edu/dbbqs) provides genome quality scores for all available prokaryotic genome sequences through a user-friendly web interface. These scores can be used as cut-offs to obtain a high-quality set of genomes for testing bioinformatics tools or improving an analysis. Moreover, all data from the four measurements that were combined to make the quality score for each genome are also provided and can potentially be used for further analysis. dBBQs will be updated regularly and is free to use for non-commercial purposes.
format Online
Article
Text
id pubmed-5751761
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5751761 2018-01-05 dBBQs: dataBase of Bacterial Quality scores Wanchai, Visanu Patumcharoenpol, Preecha Nookaew, Intawat Ussery, David BMC Bioinformatics Database BioMed Central 2017-12-28 /pmc/articles/PMC5751761/ /pubmed/29297289 http://dx.doi.org/10.1186/s12859-017-1900-9 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Database
Wanchai, Visanu
Patumcharoenpol, Preecha
Nookaew, Intawat
Ussery, David
dBBQs: dataBase of Bacterial Quality scores
title dBBQs: dataBase of Bacterial Quality scores
title_full dBBQs: dataBase of Bacterial Quality scores
title_fullStr dBBQs: dataBase of Bacterial Quality scores
title_full_unstemmed dBBQs: dataBase of Bacterial Quality scores
title_short dBBQs: dataBase of Bacterial Quality scores
title_sort dbbqs: database of bacterial quality scores
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751761/
https://www.ncbi.nlm.nih.gov/pubmed/29297289
http://dx.doi.org/10.1186/s12859-017-1900-9
work_keys_str_mv AT wanchaivisanu dbbqsdatabaseofbacterialqualityscores
AT patumcharoenpolpreecha dbbqsdatabaseofbacterialqualityscores
AT nookaewintawat dbbqsdatabaseofbacterialqualityscores
AT usserydavid dbbqsdatabaseofbacterialqualityscores