Selective prediction of interaction sites in protein structures with THEMATICS

BACKGROUND: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Th...

Bibliographic Details
Main authors: Wei, Ying; Ko, Jaeju; Murga, Leonel F; Ondrechen, Mary Jo
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1877815/
https://www.ncbi.nlm.nih.gov/pubmed/17419878
http://dx.doi.org/10.1186/1471-2105-8-119
_version_ 1782133567071977472
author Wei, Ying
Ko, Jaeju
Murga, Leonel F
Ondrechen, Mary Jo
author_sort Wei, Ying
collection PubMed
description BACKGROUND: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Theoretical Microscopic Titration Curves (THEMATICS), a simple computational method for the identification of active sites in enzymes. Recall and precision are measured and compared with other methods for the prediction of catalytic sites.
RESULTS: Using a test set of 169 enzymes from the original Catalytic Residue Dataset (CatRes) it is shown that THEMATICS can deliver precise, localised site predictions. Furthermore, adjustment of the cut-off criteria can improve the recall rates for catalytic residues with only a small sacrifice in precision. Recall rates for CatRes/CSA annotated catalytic residues are 41.1%, 50.4%, and 54.2% for Z score cut-off values of 1.00, 0.99, and 0.98, respectively. The corresponding precision rates are 19.4%, 17.9%, and 16.4%. The success rate for catalytic sites is higher, with correct or partially correct predictions for 77.5%, 85.8%, and 88.2% of the enzymes in the test set, corresponding to the same respective Z score cut-offs, if only the CatRes annotations are used as the reference set. Incorporation of additional literature annotations into the reference set gives total success rates of 89.9%, 92.9%, and 94.1%, again for corresponding cut-off values of 1.00, 0.99, and 0.98. False positive rates for a 75-protein test set are 1.95%, 2.60%, and 3.12% for Z score cut-offs of 1.00, 0.99, and 0.98, respectively.
CONCLUSION: With a preferred cut-off value of 0.99, THEMATICS achieves a high success rate of interaction site prediction, about 86% correct or partially correct using CatRes/CSA annotations only and about 93% with an expanded reference set. Success rates for catalytic residue prediction are similar to those of other structure-based methods, but with substantially better precision and lower false positive rates. THEMATICS performs well across the spectrum of E.C. classes. The method requires only the structure of the query protein as input. THEMATICS predictions may be obtained via the web from structures in PDB format at:
format Text
id pubmed-1877815
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1877815 2007-05-25 Selective prediction of interaction sites in protein structures with THEMATICS Wei, Ying Ko, Jaeju Murga, Leonel F Ondrechen, Mary Jo BMC Bioinformatics Research Article BioMed Central 2007-04-09 /pmc/articles/PMC1877815/ /pubmed/17419878 http://dx.doi.org/10.1186/1471-2105-8-119 Text en Copyright © 2007 Wei et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Selective prediction of interaction sites in protein structures with THEMATICS
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1877815/
https://www.ncbi.nlm.nih.gov/pubmed/17419878
http://dx.doi.org/10.1186/1471-2105-8-119
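
The abstract reports recall, precision, and false positive rates at three Z score cut-offs. As a rough illustration of how such figures relate to one another, the sketch below computes them from sets of predicted and annotated residues using the standard textbook definitions, which may differ in detail from those used in the article. The residue labels, the scores, and the rule "select a residue when its score is at or above the cut-off" are hypothetical placeholders chosen for the example; they are not the THEMATICS selection procedure and not data from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int  # predicted residues that are annotated as catalytic
    fp: int  # predicted residues that are not annotated
    fn: int  # annotated residues that were not predicted
    tn: int  # residues neither predicted nor annotated


def select(scores: dict[str, float], cutoff: float) -> set[str]:
    """Hypothetical selection rule: keep residues scoring at or above the cut-off."""
    return {res for res, score in scores.items() if score >= cutoff}


def confusion(predicted: set[str], annotated: set[str], all_residues: set[str]) -> Counts:
    """Count true/false positives and negatives for one protein."""
    return Counts(
        tp=len(predicted & annotated),
        fp=len(predicted - annotated),
        fn=len(annotated - predicted),
        tn=len(all_residues - predicted - annotated),
    )


def recall(c: Counts) -> float:
    """Fraction of annotated residues that were predicted."""
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def precision(c: Counts) -> float:
    """Fraction of predicted residues that are annotated."""
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def false_positive_rate(c: Counts) -> float:
    """Fraction of unannotated residues that were predicted."""
    return c.fp / (c.fp + c.tn) if (c.fp + c.tn) else 0.0


if __name__ == "__main__":
    # Made-up scores for six residues of an imaginary enzyme (illustration only).
    scores = {"H12": 1.40, "K41": 1.05, "H119": 0.995,
              "D121": 0.985, "E2": 0.50, "Y25": 1.10}
    annotated = {"H12", "K41", "H119"}
    for cutoff in (1.00, 0.99, 0.98):
        c = confusion(select(scores, cutoff), annotated, set(scores))
        print(cutoff, round(recall(c), 3), round(precision(c), 3),
              round(false_positive_rate(c), 3))
```

In this toy data, lowering the cut-off admits more residues, so recall can only stay the same or rise while precision and the false positive rate may worsen, which is the trade-off the abstract describes.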