biochem4j: Integrated and extensible biochemical knowledge through graph databases

Biologists and biochemists have at their disposal a number of excellent, publicly available data resources such as UniProt, KEGG, and NCBI Taxonomy, which catalogue biological entities. Despite the usefulness of these resources, they remain fundamentally unconnected. While links may appear between e...

Bibliographic Details
Main authors: Swainston, Neil; Batista-Navarro, Riza; Carbonell, Pablo; Dobson, Paul D.; Dunstan, Mark; Jervis, Adrian J.; Vinaixa, Maria; Williams, Alan R.; Ananiadou, Sophia; Faulon, Jean-Loup; Mendes, Pedro; Kell, Douglas B.; Scrutton, Nigel S.; Breitling, Rainer
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5510799/
https://www.ncbi.nlm.nih.gov/pubmed/28708831
http://dx.doi.org/10.1371/journal.pone.0179130
_version_ 1783250227295158272
author Swainston, Neil
Batista-Navarro, Riza
Carbonell, Pablo
Dobson, Paul D.
Dunstan, Mark
Jervis, Adrian J.
Vinaixa, Maria
Williams, Alan R.
Ananiadou, Sophia
Faulon, Jean-Loup
Mendes, Pedro
Kell, Douglas B.
Scrutton, Nigel S.
Breitling, Rainer
author_facet Swainston, Neil
Batista-Navarro, Riza
Carbonell, Pablo
Dobson, Paul D.
Dunstan, Mark
Jervis, Adrian J.
Vinaixa, Maria
Williams, Alan R.
Ananiadou, Sophia
Faulon, Jean-Loup
Mendes, Pedro
Kell, Douglas B.
Scrutton, Nigel S.
Breitling, Rainer
author_sort Swainston, Neil
collection PubMed
description Biologists and biochemists have at their disposal a number of excellent, publicly available data resources such as UniProt, KEGG, and NCBI Taxonomy, which catalogue biological entities. Despite the usefulness of these resources, they remain fundamentally unconnected. While links may appear between entries across these databases, users are typically only able to follow such links by manual browsing or through specialised workflows. Although many of the resources provide web-service interfaces for computational access, performing federated queries across databases remains a non-trivial but essential activity in interdisciplinary systems and synthetic biology programmes. What is needed are integrated repositories to catalogue both biological entities and–crucially–the relationships between them. Such a resource should be extensible, such that newly discovered relationships–for example, those between novel, synthetic enzymes and non-natural products–can be added over time. With the introduction of graph databases, the barrier to the rapid generation, extension and querying of such a resource has been lowered considerably. With a particular focus on metabolic engineering as an illustrative application domain, biochem4j, freely available at http://biochem4j.org, is introduced to provide an integrated, queryable database that warehouses chemical, reaction, enzyme and taxonomic data from a range of reliable resources. The biochem4j framework establishes a starting point for the flexible integration and exploitation of an ever-wider range of biological data sources, from public databases to laboratory-specific experimental datasets, for the benefit of systems biologists, biosystems engineers and the wider community of molecular biologists and biological chemists.
format Online
Article
Text
id pubmed-5510799
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5510799 2017-08-07 biochem4j: Integrated and extensible biochemical knowledge through graph databases Swainston, Neil Batista-Navarro, Riza Carbonell, Pablo Dobson, Paul D. Dunstan, Mark Jervis, Adrian J. Vinaixa, Maria Williams, Alan R. Ananiadou, Sophia Faulon, Jean-Loup Mendes, Pedro Kell, Douglas B. Scrutton, Nigel S. Breitling, Rainer PLoS One Research Article Biologists and biochemists have at their disposal a number of excellent, publicly available data resources such as UniProt, KEGG, and NCBI Taxonomy, which catalogue biological entities. Despite the usefulness of these resources, they remain fundamentally unconnected. While links may appear between entries across these databases, users are typically only able to follow such links by manual browsing or through specialised workflows. Although many of the resources provide web-service interfaces for computational access, performing federated queries across databases remains a non-trivial but essential activity in interdisciplinary systems and synthetic biology programmes. What is needed are integrated repositories to catalogue both biological entities and–crucially–the relationships between them. Such a resource should be extensible, such that newly discovered relationships–for example, those between novel, synthetic enzymes and non-natural products–can be added over time. With the introduction of graph databases, the barrier to the rapid generation, extension and querying of such a resource has been lowered considerably. With a particular focus on metabolic engineering as an illustrative application domain, biochem4j, freely available at http://biochem4j.org, is introduced to provide an integrated, queryable database that warehouses chemical, reaction, enzyme and taxonomic data from a range of reliable resources.
The biochem4j framework establishes a starting point for the flexible integration and exploitation of an ever-wider range of biological data sources, from public databases to laboratory-specific experimental datasets, for the benefit of systems biologists, biosystems engineers and the wider community of molecular biologists and biological chemists. Public Library of Science 2017-07-14 /pmc/articles/PMC5510799/ /pubmed/28708831 http://dx.doi.org/10.1371/journal.pone.0179130 Text en © 2017 Swainston et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Swainston, Neil
Batista-Navarro, Riza
Carbonell, Pablo
Dobson, Paul D.
Dunstan, Mark
Jervis, Adrian J.
Vinaixa, Maria
Williams, Alan R.
Ananiadou, Sophia
Faulon, Jean-Loup
Mendes, Pedro
Kell, Douglas B.
Scrutton, Nigel S.
Breitling, Rainer
biochem4j: Integrated and extensible biochemical knowledge through graph databases
title biochem4j: Integrated and extensible biochemical knowledge through graph databases
title_sort biochem4j: integrated and extensible biochemical knowledge through graph databases
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5510799/
https://www.ncbi.nlm.nih.gov/pubmed/28708831
http://dx.doi.org/10.1371/journal.pone.0179130