VFDB 2016: hierarchical and refined dataset for big data analysis—10 years on
The virulence factor database (VFDB, http://www.mgc.ac.cn/VFs/) is dedicated to providing up-to-date knowledge of virulence factors (VFs) of various bacterial pathogens. Since its inception the VFDB has served as a comprehensive repository of bacterial VFs for over a decade. The exponential growth i...
Main authors: | Chen, Lihong; Zheng, Dandan; Liu, Bo; Yang, Jian; Jin, Qi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702877/ https://www.ncbi.nlm.nih.gov/pubmed/26578559 http://dx.doi.org/10.1093/nar/gkv1239 |
_version_ | 1782408670714265600 |
---|---|
author | Chen, Lihong Zheng, Dandan Liu, Bo Yang, Jian Jin, Qi |
author_facet | Chen, Lihong Zheng, Dandan Liu, Bo Yang, Jian Jin, Qi |
author_sort | Chen, Lihong |
collection | PubMed |
description | The virulence factor database (VFDB, http://www.mgc.ac.cn/VFs/) is dedicated to providing up-to-date knowledge of virulence factors (VFs) of various bacterial pathogens. Since its inception the VFDB has served as a comprehensive repository of bacterial VFs for over a decade. The exponential growth in the amount of biological data is challenging to the current database in regard to big data analysis. We recently improved two aspects of the infrastructural dataset of VFDB: (i) removed the redundancy introduced by previous releases and generated two hierarchical datasets – one core dataset of experimentally verified VFs only and another full dataset including all known and predicted VFs and (ii) refined the gene annotation of the core dataset with controlled vocabularies. Our efforts enhanced the data quality of the VFDB and promoted the usability of the database in the big data era for the bioinformatic mining of the explosively growing data regarding bacterial VFs. |
format | Online Article Text |
id | pubmed-4702877 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-47028772016-01-07 VFDB 2016: hierarchical and refined dataset for big data analysis—10 years on Chen, Lihong Zheng, Dandan Liu, Bo Yang, Jian Jin, Qi Nucleic Acids Res Database Issue The virulence factor database (VFDB, http://www.mgc.ac.cn/VFs/) is dedicated to providing up-to-date knowledge of virulence factors (VFs) of various bacterial pathogens. Since its inception the VFDB has served as a comprehensive repository of bacterial VFs for over a decade. The exponential growth in the amount of biological data is challenging to the current database in regard to big data analysis. We recently improved two aspects of the infrastructural dataset of VFDB: (i) removed the redundancy introduced by previous releases and generated two hierarchical datasets – one core dataset of experimentally verified VFs only and another full dataset including all known and predicted VFs and (ii) refined the gene annotation of the core dataset with controlled vocabularies. Our efforts enhanced the data quality of the VFDB and promoted the usability of the database in the big data era for the bioinformatic mining of the explosively growing data regarding bacterial VFs. Oxford University Press 2016-01-04 2015-11-17 /pmc/articles/PMC4702877/ /pubmed/26578559 http://dx.doi.org/10.1093/nar/gkv1239 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Database Issue Chen, Lihong Zheng, Dandan Liu, Bo Yang, Jian Jin, Qi VFDB 2016: hierarchical and refined dataset for big data analysis—10 years on |
title | VFDB 2016: hierarchical and refined dataset for big data analysis—10 years on |
title_full | VFDB 2016: hierarchical and refined dataset for big data analysis—10 years on |
title_fullStr | VFDB 2016: hierarchical and refined dataset for big data analysis—10 years on |
title_full_unstemmed | VFDB 2016: hierarchical and refined dataset for big data analysis—10 years on |
title_short | VFDB 2016: hierarchical and refined dataset for big data analysis—10 years on |
title_sort | vfdb 2016: hierarchical and refined dataset for big data analysis—10 years on |
topic | Database Issue |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702877/ https://www.ncbi.nlm.nih.gov/pubmed/26578559 http://dx.doi.org/10.1093/nar/gkv1239 |
work_keys_str_mv | AT chenlihong vfdb2016hierarchicalandrefineddatasetforbigdataanalysis10yearson AT zhengdandan vfdb2016hierarchicalandrefineddatasetforbigdataanalysis10yearson AT liubo vfdb2016hierarchicalandrefineddatasetforbigdataanalysis10yearson AT yangjian vfdb2016hierarchicalandrefineddatasetforbigdataanalysis10yearson AT jinqi vfdb2016hierarchicalandrefineddatasetforbigdataanalysis10yearson |