MetaGeneBank: a standardized database to study deep sequenced metagenomic data from human fecal specimen

BACKGROUND: Microbiome big data from population-scale cohorts holds the key to unleash the power of microbiomes to overcome critical challenges in disease control, treatment and precision medicine. However, variations introduced during data generation and processing limit the comparisons among indep...

Bibliographic Details
Main Authors: Shao, Li; Liao, Jie; Qian, Jingyang; Chen, Wenbin; Fan, Xiaohui
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8485520/
https://www.ncbi.nlm.nih.gov/pubmed/34592929
http://dx.doi.org/10.1186/s12866-021-02321-z
_version_ 1784577552603414528
author Shao, Li
Liao, Jie
Qian, Jingyang
Chen, Wenbin
Fan, Xiaohui
author_facet Shao, Li
Liao, Jie
Qian, Jingyang
Chen, Wenbin
Fan, Xiaohui
author_sort Shao, Li
collection PubMed
description BACKGROUND: Microbiome big data from population-scale cohorts holds the key to unleash the power of microbiomes to overcome critical challenges in disease control, treatment and precision medicine. However, variations introduced during data generation and processing limit the comparisons among independent studies in respect of interpretability. Although multiple databases have been constructed as platforms for data reuse, they are of limited value since only raw sequencing files are considered. DESCRIPTION: Here, we present MetaGeneBank, a standardized database that provides details on sample collection and sequencing, and abundances of genes, microbiota and molecular functions for 4470 raw sequencing files (over 12 TB) collected from 16 studies covering over 10 types of diseases and 14 countries using a unified data-processing pipeline. The incorporation of tools that enable browsing and searching with descriptive attributes, gene sequences, microbiota and functions makes the database user-friendly. We found that the source of specimen contributes more than sequencing centers or platforms to the variations of microbiota. Special attention should be paid when re-analyzing sequencing files from different countries. CONCLUSIONS: Collectively, MetaGeneBank provides a gateway to utilize the untapped potential of gut metagenomic data in helping fighting against human diseases. With the continuous updating of the database in terms of data volume, data types and sample types, MetaGeneBank would undoubtedly be the benchmarking database in the future in respect of data reuse, and would be valuable in translational science. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12866-021-02321-z.
format Online
Article
Text
id pubmed-8485520
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8485520 2021-10-04 MetaGeneBank: a standardized database to study deep sequenced metagenomic data from human fecal specimen Shao, Li Liao, Jie Qian, Jingyang Chen, Wenbin Fan, Xiaohui BMC Microbiol Database
BioMed Central 2021-09-30 /pmc/articles/PMC8485520/ /pubmed/34592929 http://dx.doi.org/10.1186/s12866-021-02321-z Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Database
Shao, Li
Liao, Jie
Qian, Jingyang
Chen, Wenbin
Fan, Xiaohui
MetaGeneBank: a standardized database to study deep sequenced metagenomic data from human fecal specimen
title MetaGeneBank: a standardized database to study deep sequenced metagenomic data from human fecal specimen
title_full MetaGeneBank: a standardized database to study deep sequenced metagenomic data from human fecal specimen
title_fullStr MetaGeneBank: a standardized database to study deep sequenced metagenomic data from human fecal specimen
title_full_unstemmed MetaGeneBank: a standardized database to study deep sequenced metagenomic data from human fecal specimen
title_short MetaGeneBank: a standardized database to study deep sequenced metagenomic data from human fecal specimen
title_sort metagenebank: a standardized database to study deep sequenced metagenomic data from human fecal specimen
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8485520/
https://www.ncbi.nlm.nih.gov/pubmed/34592929
http://dx.doi.org/10.1186/s12866-021-02321-z
work_keys_str_mv AT shaoli metagenebankastandardizeddatabasetostudydeepsequencedmetagenomicdatafromhumanfecalspecimen
AT liaojie metagenebankastandardizeddatabasetostudydeepsequencedmetagenomicdatafromhumanfecalspecimen
AT qianjingyang metagenebankastandardizeddatabasetostudydeepsequencedmetagenomicdatafromhumanfecalspecimen
AT chenwenbin metagenebankastandardizeddatabasetostudydeepsequencedmetagenomicdatafromhumanfecalspecimen
AT fanxiaohui metagenebankastandardizeddatabasetostudydeepsequencedmetagenomicdatafromhumanfecalspecimen