
LinkHub: a Semantic Web system that facilitates cross-database queries and information retrieval in proteomics


Bibliographic Details
Main authors: Smith, Andrew K; Cheung, Kei-Hoi; Yip, Kevin Y; Schultz, Martin; Gerstein, Mark B
Format: Text
Language: English
Published: BioMed Central 2007
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1892102/
https://www.ncbi.nlm.nih.gov/pubmed/17493288
http://dx.doi.org/10.1186/1471-2105-8-S3-S5
author Smith, Andrew K
Cheung, Kei-Hoi
Yip, Kevin Y
Schultz, Martin
Gerstein, Mark B
collection PubMed
description BACKGROUND: A key abstraction in representing proteomics knowledge is the notion of unique identifiers for individual entities (e.g. proteins) and the massive graph of relationships among them. These relationships are sometimes simple (e.g. synonyms) but are often more complex (e.g. one-to-many relationships in protein family membership). RESULTS: We have built a software system called LinkHub using Semantic Web RDF that manages the graph of identifier relationships and allows exploration with a variety of interfaces. For efficiency, we also provide relational-database access and translation between the relational and RDF versions. LinkHub is practically useful in creating small, local hubs on common topics and then connecting these to major portals in a federated architecture; we have used LinkHub to establish such a relationship between UniProt and the North East Structural Genomics Consortium. LinkHub also facilitates queries and access to information and documents related to identifiers spread across multiple databases, acting as "connecting glue" between different identifier spaces. We demonstrate this with example queries discovering "interologs" of yeast protein interactions in the worm and exploring the relationship between gene essentiality and pseudogene content. We also show how "protein family based" retrieval of documents can be achieved. LinkHub is available at hub.gersteinlab.org and hub.nesg.org with supplement, database models and full-source code. CONCLUSION: LinkHub leverages Semantic Web standards-based integrated data to provide novel information retrieval to identifier-related documents through relational graph queries, simplifies and manages connections to major hubs such as UniProt, and provides useful interactive and query interfaces for exploring the integrated data.
format Text
id pubmed-1892102
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1892102 2007-06-15 LinkHub: a Semantic Web system that facilitates cross-database queries and information retrieval in proteomics Smith, Andrew K Cheung, Kei-Hoi Yip, Kevin Y Schultz, Martin Gerstein, Mark B BMC Bioinformatics Research BioMed Central 2007-05-09 /pmc/articles/PMC1892102/ /pubmed/17493288 http://dx.doi.org/10.1186/1471-2105-8-S3-S5 Text en Copyright © 2007 Smith et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title LinkHub: a Semantic Web system that facilitates cross-database queries and information retrieval in proteomics
topic Research