
Database limitations for studying the human gut microbiome

BACKGROUND: In the last twenty years, new methodologies have made it possible to gather large amounts of data on the genetic information and metabolic functions associated with the human gut microbiome. Nevertheless, processing all of the available data might not be the simplest of tasks...


Bibliographic Details
Main Authors: Dias, Camila K, Starke, Robert, Pylro, Victor S., Morais, Daniel K.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2020
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924478/
https://www.ncbi.nlm.nih.gov/pubmed/33816940
http://dx.doi.org/10.7717/peerj-cs.289
author Dias, Camila K
Starke, Robert
Pylro, Victor S.
Morais, Daniel K.
collection PubMed
description BACKGROUND: In the last twenty years, new methodologies have made it possible to gather large amounts of data on the genetic information and metabolic functions associated with the human gut microbiome. Nevertheless, processing all of the available data might not be the simplest of tasks, which could result in an excess of information awaiting proper annotation. This assessment aimed to evaluate how well widely respected databases could describe a mock human gut microbiome. METHODS: In this work, we critically evaluate the output of the cross-reference between the UniProt Knowledge Base (UniProtKB) and the Kyoto Encyclopedia of Genes and Genomes Orthologs (KEGG Orthologs) or the evolutionary genealogy of genes: Non-supervised Orthologous Groups (EggNOG) databases for a list of species previously found in the human gut microbiome. RESULTS: From a list comprising 131 species and 52 genera, 53 species and 40 genera had corresponding entries in the KEGG database, and 82 species and 47 genera had corresponding entries in the EggNOG database. Moreover, we present the KEGG Orthologs (KOs) and EggNOG Orthologs (NOGs) entries associated with the search, their distribution over species and genera, and lists of functions that appeared in many species or genera, the “core” functions of the human gut microbiome. We also present the relative abundance of KOs and NOGs across phyla and genera. Lastly, we report a variation found between searches with different arguments on the database entries. Inferring functionality by cross-referencing UniProt with KEGG or EggNOG can give limited results because of the low number of annotated species in UniProt and the lower number of functions affiliated with the majority of these species. Additionally, the EggNOG database performed better in a cross-search with UniProt for a mock human gut microbiome. Nonetheless, efforts targeting cultivation, single-cell sequencing, or the reconstruction of high-quality metagenome-assembled genomes (MAGs) and their annotation are needed to allow the use of these databases for inferring functionality in human gut microbiome studies.
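The coverage figures reported in the abstract can be restated as fractions of the species and genus lists. The following is a minimal sketch that uses only the counts given above; the variable and function names are illustrative assumptions and do not come from the article or from any database API.

```python
# Counts taken from the abstract. All names here are hypothetical,
# chosen for illustration only.
TOTAL = {"species": 131, "genera": 52}
MATCHED = {
    "KEGG": {"species": 53, "genera": 40},
    "EggNOG": {"species": 82, "genera": 47},
}


def coverage(matched, total):
    """Return, per taxonomic rank, the fraction of listed taxa with an entry."""
    return {rank: matched[rank] / total[rank] for rank in total}


# Fraction of the list covered by each database, per rank.
COVERAGE = {db: coverage(counts, TOTAL) for db, counts in MATCHED.items()}

for db, fractions in COVERAGE.items():
    for rank, fraction in fractions.items():
        print(f"{db:7s} {rank:8s} {fraction:.1%}")
```

Under these counts, KEGG covers roughly 40% of the listed species and 77% of the genera, while EggNOG covers roughly 63% of the species and 90% of the genera, which is consistent with the abstract's statement that EggNOG performed better in the cross-search.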
format Online
Article
Text
id pubmed-7924478
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7924478 2021-04-02 Database limitations for studying the human gut microbiome Dias, Camila K Starke, Robert Pylro, Victor S. Morais, Daniel K. PeerJ Comput Sci Bioinformatics PeerJ Inc. 2020-08-17 /pmc/articles/PMC7924478/ /pubmed/33816940 http://dx.doi.org/10.7717/peerj-cs.289 Text en ©2020 Dias et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Database limitations for studying the human gut microbiome
topic Bioinformatics