iRefIndex: A consolidated protein interaction database with provenance
BACKGROUND: Interaction data for a given protein may be spread across multiple databases. We set out to create a unifying index that would facilitate searching for these data and that would group together redundant interaction data while recording the methods used to perform this grouping. RESULTS:...
Main Authors: | Razick, Sabry; Magklaras, George; Donaldson, Ian M |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2573892/ https://www.ncbi.nlm.nih.gov/pubmed/18823568 http://dx.doi.org/10.1186/1471-2105-9-405 |
_version_ | 1782160284999221248 |
---|---|
author | Razick, Sabry Magklaras, George Donaldson, Ian M |
author_facet | Razick, Sabry Magklaras, George Donaldson, Ian M |
author_sort | Razick, Sabry |
collection | PubMed |
description | BACKGROUND: Interaction data for a given protein may be spread across multiple databases. We set out to create a unifying index that would facilitate searching for these data and that would group together redundant interaction data while recording the methods used to perform this grouping. RESULTS: We present a method to generate a key for a protein interaction record and a key for each participant protein. These keys may be generated by anyone using only the primary sequence of the proteins, their taxonomy identifiers and the Secure Hash Algorithm. Two interaction records will have identical keys if they refer to the same set of identical protein sequences and taxonomy identifiers. We define records with identical keys as a redundant group. Our method required that we map protein database references found in interaction records to current protein sequence records. Operations performed during this mapping are described by a mapping score that may provide valuable feedback to source interaction databases on problematic references that are malformed, deprecated, ambiguous or unfound. Keys for protein participants allow for retrieval of interaction information independent of the protein references used in the original records. CONCLUSION: We have applied our method to protein interaction records from BIND, BioGrid, DIP, HPRD, IntAct, MINT, MPact, MPPI and OPHID. The resulting interaction reference index is provided in PSI-MITAB 2.5 format at . This index may form the basis of alternative redundant groupings based on gene identifiers or near sequence identity groupings. |
format | Text |
id | pubmed-2573892 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2573892 2008-10-28 iRefIndex: A consolidated protein interaction database with provenance Razick, Sabry Magklaras, George Donaldson, Ian M BMC Bioinformatics Research Article BACKGROUND: Interaction data for a given protein may be spread across multiple databases. We set out to create a unifying index that would facilitate searching for these data and that would group together redundant interaction data while recording the methods used to perform this grouping. RESULTS: We present a method to generate a key for a protein interaction record and a key for each participant protein. These keys may be generated by anyone using only the primary sequence of the proteins, their taxonomy identifiers and the Secure Hash Algorithm. Two interaction records will have identical keys if they refer to the same set of identical protein sequences and taxonomy identifiers. We define records with identical keys as a redundant group. Our method required that we map protein database references found in interaction records to current protein sequence records. Operations performed during this mapping are described by a mapping score that may provide valuable feedback to source interaction databases on problematic references that are malformed, deprecated, ambiguous or unfound. Keys for protein participants allow for retrieval of interaction information independent of the protein references used in the original records. CONCLUSION: We have applied our method to protein interaction records from BIND, BioGrid, DIP, HPRD, IntAct, MINT, MPact, MPPI and OPHID. The resulting interaction reference index is provided in PSI-MITAB 2.5 format at . This index may form the basis of alternative redundant groupings based on gene identifiers or near sequence identity groupings. BioMed Central 2008-09-30 /pmc/articles/PMC2573892/ /pubmed/18823568 http://dx.doi.org/10.1186/1471-2105-9-405 Text en Copyright © 2008 Razick et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Razick, Sabry Magklaras, George Donaldson, Ian M iRefIndex: A consolidated protein interaction database with provenance |
title | iRefIndex: A consolidated protein interaction database with provenance |
title_full | iRefIndex: A consolidated protein interaction database with provenance |
title_fullStr | iRefIndex: A consolidated protein interaction database with provenance |
title_full_unstemmed | iRefIndex: A consolidated protein interaction database with provenance |
title_short | iRefIndex: A consolidated protein interaction database with provenance |
title_sort | irefindex: a consolidated protein interaction database with provenance |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2573892/ https://www.ncbi.nlm.nih.gov/pubmed/18823568 http://dx.doi.org/10.1186/1471-2105-9-405 |
work_keys_str_mv | AT razicksabry irefindexaconsolidatedproteininteractiondatabasewithprovenance AT magklarasgeorge irefindexaconsolidatedproteininteractiondatabasewithprovenance AT donaldsonianm irefindexaconsolidatedproteininteractiondatabasewithprovenance |
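
The description field above says that a key can be generated for each participant protein and for each interaction record from only the primary sequence, the taxonomy identifier and the Secure Hash Algorithm, and that records with identical keys form a redundant group. The following is a minimal sketch of that idea, not the authors' implementation. The record does not state the SHA variant, the text encoding, or how sequence and taxonomy identifier are combined, so the choices here (SHA-1, base64 output with padding removed, taxonomy identifier appended to the sequence digest, participant keys sorted before hashing) are assumptions, and the function names `protein_key`, `interaction_key` and `group_redundant` are hypothetical.

```python
import base64
import hashlib
from collections import defaultdict


def _sha1_b64(text: str) -> str:
    """Return the SHA-1 digest of `text` as base64 without '=' padding."""
    digest = hashlib.sha1(text.encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def protein_key(sequence: str, taxonomy_id: int) -> str:
    """Hypothetical participant key: digest of the sequence plus the taxonomy id.

    Assumption: whitespace is removed and the sequence is upper-cased first,
    so formatting differences do not change the key.
    """
    normalized = "".join(sequence.split()).upper()
    return _sha1_b64(normalized) + str(taxonomy_id)


def interaction_key(participant_keys) -> str:
    """Hypothetical interaction key: digest of the sorted participant keys.

    Sorting makes the key independent of the order in which a source
    database lists the participants (an assumption of this sketch).
    """
    return _sha1_b64("".join(sorted(participant_keys)))


def group_redundant(records):
    """Group record ids whose interaction keys are identical.

    `records` maps a record id to a list of (sequence, taxonomy_id) pairs.
    """
    groups = defaultdict(list)
    for record_id, participants in records.items():
        keys = [protein_key(seq, tax) for seq, tax in participants]
        groups[interaction_key(keys)].append(record_id)
    return dict(groups)


# Made-up sequences and record ids, for illustration only.
records = {
    "db1:0001": [("MKTAYIAK", 9606), ("MSDNELQR", 9606)],
    "db2:0042": [("msdnelqr", 9606), ("MKTAYIAK", 9606)],  # same pair, reordered
    "db3:0007": [("MKTAYIAK", 10090), ("MSDNELQR", 10090)],  # different taxon
}
redundant_groups = group_redundant(records)
```

In this sketch the first two records land in the same group because they describe the same sequences and taxonomy identifiers, while the third gets its own key because the taxonomy identifier differs. Grouping by gene identifier or by near sequence identity, which the description mentions as possible alternatives, would need a different key function.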