
Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms

BACKGROUND: The number of algorithms available to predict ligand-protein interactions is large and ever-increasing. The number of test cases used to validate these methods is usually small and problem dependent. Recently, several databases have been released for further understanding of protein-liga...


Bibliographic Details
Main Authors: Diago, Luis A, Morell, Persy, Aguilera, Longendri, Moreno, Ernesto
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2008766/
https://www.ncbi.nlm.nih.gov/pubmed/17718923
http://dx.doi.org/10.1186/1471-2105-8-310
_version_ 1782135982809677824
author Diago, Luis A
Morell, Persy
Aguilera, Longendri
Moreno, Ernesto
author_facet Diago, Luis A
Morell, Persy
Aguilera, Longendri
Moreno, Ernesto
author_sort Diago, Luis A
collection PubMed
description BACKGROUND: The number of algorithms available to predict ligand-protein interactions is large and ever-increasing. The number of test cases used to validate these methods is usually small and problem dependent. Recently, several databases have been released for further understanding of protein-ligand interactions, having the Protein Data Bank as backend support. Nevertheless, it appears to be difficult to test docking methods on a large variety of complexes. In this paper we report the development of a new database of protein-ligand complexes tailored for testing of docking algorithms. METHODS: Using a new definition of molecular contact, small ligands contained in the 2005 PDB edition were identified and processed. The database was enriched in molecular properties. In particular, an automated typing of ligand atoms was performed. A filtering procedure was applied to select a non-redundant dataset of complexes. Data mining was performed to obtain information on the frequencies of different types of atomic contacts. Docking simulations were run with the program DOCK. RESULTS: We compiled a large database of small ligand-protein complexes, enriched with different calculated properties, that currently contains more than 6000 non-redundant structures. As an example to demonstrate the value of the new database, we derived a new set of chemical matching rules to be used in the context of the program DOCK, based on contact frequencies between ligand atoms and points representing the protein surface, and proved their enhanced efficiency with respect to the default set of rules included in that program. CONCLUSION: The new database constitutes a valuable resource for the development of knowledge-based docking algorithms and for testing docking programs on large sets of protein-ligand complexes. The new chemical matching rules proposed in this work significantly increase the success rate in DOCKing simulations. 
The database developed in this work is available at .
format Text
id pubmed-2008766
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-20087662007-10-10 Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms Diago, Luis A Morell, Persy Aguilera, Longendri Moreno, Ernesto BMC Bioinformatics Research Article BACKGROUND: The number of algorithms available to predict ligand-protein interactions is large and ever-increasing. The number of test cases used to validate these methods is usually small and problem dependent. Recently, several databases have been released for further understanding of protein-ligand interactions, having the Protein Data Bank as backend support. Nevertheless, it appears to be difficult to test docking methods on a large variety of complexes. In this paper we report the development of a new database of protein-ligand complexes tailored for testing of docking algorithms. METHODS: Using a new definition of molecular contact, small ligands contained in the 2005 PDB edition were identified and processed. The database was enriched in molecular properties. In particular, an automated typing of ligand atoms was performed. A filtering procedure was applied to select a non-redundant dataset of complexes. Data mining was performed to obtain information on the frequencies of different types of atomic contacts. Docking simulations were run with the program DOCK. RESULTS: We compiled a large database of small ligand-protein complexes, enriched with different calculated properties, that currently contains more than 6000 non-redundant structures. As an example to demonstrate the value of the new database, we derived a new set of chemical matching rules to be used in the context of the program DOCK, based on contact frequencies between ligand atoms and points representing the protein surface, and proved their enhanced efficiency with respect to the default set of rules included in that program. 
CONCLUSION: The new database constitutes a valuable resource for the development of knowledge-based docking algorithms and for testing docking programs on large sets of protein-ligand complexes. The new chemical matching rules proposed in this work significantly increase the success rate in DOCKing simulations. The database developed in this work is available at . BioMed Central 2007-08-25 /pmc/articles/PMC2008766/ /pubmed/17718923 http://dx.doi.org/10.1186/1471-2105-8-310 Text en Copyright © 2007 Diago et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Diago, Luis A
Morell, Persy
Aguilera, Longendri
Moreno, Ernesto
Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms
title Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms
title_full Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms
title_fullStr Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms
title_full_unstemmed Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms
title_short Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms
title_sort setting up a large set of protein-ligand pdb complexes for the development and validation of knowledge-based docking algorithms
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2008766/
https://www.ncbi.nlm.nih.gov/pubmed/17718923
http://dx.doi.org/10.1186/1471-2105-8-310
work_keys_str_mv AT diagoluisa settingupalargesetofproteinligandpdbcomplexesforthedevelopmentandvalidationofknowledgebaseddockingalgorithms
AT morellpersy settingupalargesetofproteinligandpdbcomplexesforthedevelopmentandvalidationofknowledgebaseddockingalgorithms
AT aguileralongendri settingupalargesetofproteinligandpdbcomplexesforthedevelopmentandvalidationofknowledgebaseddockingalgorithms
AT morenoernesto settingupalargesetofproteinligandpdbcomplexesforthedevelopmentandvalidationofknowledgebaseddockingalgorithms