A systematic comparison of the MetaCyc and KEGG pathway databases

BACKGROUND: The MetaCyc and KEGG projects have developed large metabolic pathway databases that are used for a variety of applications including genome analysis and metabolic engineering. We present a comparison of the compound, reaction, and pathway content of MetaCyc version 16.0 and a KEGG versio...

Bibliographic Details
Main Authors: Altman, Tomer, Travers, Michael, Kothari, Anamika, Caspi, Ron, Karp, Peter D
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665663/
https://www.ncbi.nlm.nih.gov/pubmed/23530693
http://dx.doi.org/10.1186/1471-2105-14-112
author Altman, Tomer
Travers, Michael
Kothari, Anamika
Caspi, Ron
Karp, Peter D
author_sort Altman, Tomer
collection PubMed
description BACKGROUND: The MetaCyc and KEGG projects have developed large metabolic pathway databases that are used for a variety of applications including genome analysis and metabolic engineering. We present a comparison of the compound, reaction, and pathway content of MetaCyc version 16.0 and a KEGG version downloaded on Feb-27-2012 to increase understanding of their relative sizes, their degree of overlap, and their scope. To assess their overlap, we must know the correspondences between compounds, reactions, and pathways in MetaCyc and those in KEGG. We devoted significant effort to computational and manual matching of these entities, and we evaluated the accuracy of the correspondences.
RESULTS: KEGG contains 179 module pathways versus 1,846 base pathways in MetaCyc; KEGG contains 237 map pathways versus 296 super pathways in MetaCyc. KEGG pathways contain 3.3 times as many reactions on average as do MetaCyc pathways, and the databases employ different conceptualizations of metabolic pathways. KEGG contains 8,692 reactions versus 10,262 for MetaCyc. 6,174 KEGG reactions are components of KEGG pathways versus 6,348 for MetaCyc. KEGG contains 16,586 compounds versus 11,991 for MetaCyc. 6,912 KEGG compounds act as substrates in KEGG reactions versus 8,891 for MetaCyc. MetaCyc contains a broader set of database attributes than does KEGG, such as relationships from a compound to enzymes that it regulates, identification of spontaneous reactions, and the expected taxonomic range of metabolic pathways. MetaCyc contains many pathways not found in KEGG, from plants, fungi, metazoa, and actinobacteria; KEGG contains pathways not found in MetaCyc, for xenobiotic degradation, glycan metabolism, and metabolism of terpenoids and polyketides. MetaCyc contains fewer unbalanced reactions, which facilitates metabolic modeling, for example with flux-balance analysis. MetaCyc includes generic reactions that may be instantiated computationally.
CONCLUSIONS: KEGG contains significantly more compounds than does MetaCyc, whereas MetaCyc contains significantly more reactions and pathways than does KEGG; in particular, KEGG modules are quite incomplete. The numbers of reactions occurring in pathways in the two DBs are quite similar.
format Online
Article
Text
id pubmed-3665663
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3665663 2013-06-05 A systematic comparison of the MetaCyc and KEGG pathway databases. BMC Bioinformatics. Research Article. BioMed Central 2013-03-27. Text en. Copyright © 2013 Altman et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A systematic comparison of the MetaCyc and KEGG pathway databases
topic Research Article