Domain-based small molecule binding site annotation

BACKGROUND: Accurate small molecule binding site information for a protein can facilitate studies in drug docking, drug discovery and function prediction, but small molecule binding site protein sequence annotation is sparse. The Small Molecule Interaction Database (SMID), a database of protein doma...

Bibliographic Details
Main authors: Snyder, Kevin A; Feldman, Howard J; Dumontier, Michel; Salama, John J; Hogue, Christopher WV
Format: Text
Language: English
Published: BioMed Central 2006
Subjects: Database
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1435939/
https://www.ncbi.nlm.nih.gov/pubmed/16545112
http://dx.doi.org/10.1186/1471-2105-7-152
_version_ 1782127299840180224
author Snyder, Kevin A
Feldman, Howard J
Dumontier, Michel
Salama, John J
Hogue, Christopher WV
author_sort Snyder, Kevin A
collection PubMed
description BACKGROUND: Accurate small molecule binding site information for a protein can facilitate studies in drug docking, drug discovery and function prediction, but sequence-level annotation of small molecule binding sites is sparse. The Small Molecule Interaction Database (SMID), a database of protein domain-small molecule interactions, was created using structural data from the Protein Data Bank (PDB). More importantly, it provides a means to predict small molecule binding sites on proteins of known or unknown structure and, unlike prior approaches, removes large numbers of false positive hits arising from transitive alignment errors, non-biologically significant small molecules and crystallographic conditions that overpredict ion binding sites. DESCRIPTION: Starting from a set of co-crystallized protein-small molecule structures, SMID interactions were generated by identifying protein domains that bind to small molecules, using NCBI's Reverse Position Specific BLAST (RPS-BLAST) algorithm. SMID records are available for viewing at . The SMID-BLAST tool provides accurate transitive annotation of small molecule binding sites for proteins not found in the PDB. Given a protein sequence, SMID-BLAST identifies domains using RPS-BLAST and then lists potential small molecule ligands based on SMID records, together with their aligned binding sites. A heuristic ligand score, calculated from E-value, ligand residue identity and domain entropy, assigns a level of confidence to each hit. SMID-BLAST predictions were validated against a set of 793 experimental small molecule interactions from the PDB; in 472 (60%) of these, the predicted small molecule identically matched the experimental one, and in 344 of those 472, more than 80% of the binding site residues were correctly identified. Further, we estimate that 45% of predictions which were not observed in the PDB validation set may be true positives.
CONCLUSION: By focusing on protein domain-small molecule interactions, SMID is able to cluster similar interactions and detect subtle binding patterns that would not otherwise be obvious. Using SMID-BLAST, small molecule targets can be predicted for any protein sequence, the only limitation being that the small molecule must exist in the PDB. Validation results and specific examples in the article illustrate that SMID-BLAST predicts both the small molecule ligand and the binding site residue positions for a query protein with a high degree of accuracy.
format Text
id pubmed-1435939
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1435939 2006-04-21 Domain-based small molecule binding site annotation Snyder, Kevin A Feldman, Howard J Dumontier, Michel Salama, John J Hogue, Christopher WV BMC Bioinformatics Database BioMed Central 2006-03-17 /pmc/articles/PMC1435939/ /pubmed/16545112 http://dx.doi.org/10.1186/1471-2105-7-152 Text en Copyright © 2006 Snyder et al; licensee BioMed Central Ltd.
title Domain-based small molecule binding site annotation
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1435939/
https://www.ncbi.nlm.nih.gov/pubmed/16545112
http://dx.doi.org/10.1186/1471-2105-7-152