Selective prediction of interaction sites in protein structures with THEMATICS
BACKGROUND: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Th...
Main authors: | Wei, Ying; Ko, Jaeju; Murga, Leonel F; Ondrechen, Mary Jo |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1877815/ https://www.ncbi.nlm.nih.gov/pubmed/17419878 http://dx.doi.org/10.1186/1471-2105-8-119 |
_version_ | 1782133567071977472 |
---|---|
author | Wei, Ying Ko, Jaeju Murga, Leonel F Ondrechen, Mary Jo |
author_facet | Wei, Ying Ko, Jaeju Murga, Leonel F Ondrechen, Mary Jo |
author_sort | Wei, Ying |
collection | PubMed |
description | BACKGROUND: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Theoretical Microscopic Titration Curves (THEMATICS), a simple computational method for the identification of active sites in enzymes. Recall and precision are measured and compared with other methods for the prediction of catalytic sites. RESULTS: Using a test set of 169 enzymes from the original Catalytic Residue Dataset (CatRes) it is shown that THEMATICS can deliver precise, localised site predictions. Furthermore, adjustment of the cut-off criteria can improve the recall rates for catalytic residues with only a small sacrifice in precision. Recall rates for CatRes/CSA annotated catalytic residues are 41.1%, 50.4%, and 54.2% for Z score cut-off values of 1.00, 0.99, and 0.98, respectively. The corresponding precision rates are 19.4%, 17.9%, and 16.4%. The success rate for catalytic sites is higher, with correct or partially correct predictions for 77.5%, 85.8%, and 88.2% of the enzymes in the test set, corresponding to the same respective Z score cut-offs, if only the CatRes annotations are used as the reference set. Incorporation of additional literature annotations into the reference set gives total success rates of 89.9%, 92.9%, and 94.1%, again for corresponding cut-off values of 1.00, 0.99, and 0.98. False positive rates for a 75-protein test set are 1.95%, 2.60%, and 3.12% for Z score cut-offs of 1.00, 0.99, and 0.98, respectively. CONCLUSION: With a preferred cut-off value of 0.99, THEMATICS achieves a high success rate of interaction site prediction, about 86% correct or partially correct using CatRes/CSA annotations only and about 93% with an expanded reference set. Success rates for catalytic residue prediction are similar to those of other structure-based methods, but with substantially better precision and lower false positive rates. THEMATICS performs well across the spectrum of E.C. classes. The method requires only the structure of the query protein as input. THEMATICS predictions may be obtained via the web from structures in PDB format at: |
format | Text |
id | pubmed-1877815 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1877815 2007-05-25 Selective prediction of interaction sites in protein structures with THEMATICS Wei, Ying Ko, Jaeju Murga, Leonel F Ondrechen, Mary Jo BMC Bioinformatics Research Article BACKGROUND: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Theoretical Microscopic Titration Curves (THEMATICS), a simple computational method for the identification of active sites in enzymes. Recall and precision are measured and compared with other methods for the prediction of catalytic sites. RESULTS: Using a test set of 169 enzymes from the original Catalytic Residue Dataset (CatRes) it is shown that THEMATICS can deliver precise, localised site predictions. Furthermore, adjustment of the cut-off criteria can improve the recall rates for catalytic residues with only a small sacrifice in precision. Recall rates for CatRes/CSA annotated catalytic residues are 41.1%, 50.4%, and 54.2% for Z score cut-off values of 1.00, 0.99, and 0.98, respectively. The corresponding precision rates are 19.4%, 17.9%, and 16.4%. The success rate for catalytic sites is higher, with correct or partially correct predictions for 77.5%, 85.8%, and 88.2% of the enzymes in the test set, corresponding to the same respective Z score cut-offs, if only the CatRes annotations are used as the reference set. Incorporation of additional literature annotations into the reference set gives total success rates of 89.9%, 92.9%, and 94.1%, again for corresponding cut-off values of 1.00, 0.99, and 0.98. False positive rates for a 75-protein test set are 1.95%, 2.60%, and 3.12% for Z score cut-offs of 1.00, 0.99, and 0.98, respectively. CONCLUSION: With a preferred cut-off value of 0.99, THEMATICS achieves a high success rate of interaction site prediction, about 86% correct or partially correct using CatRes/CSA annotations only and about 93% with an expanded reference set. Success rates for catalytic residue prediction are similar to those of other structure-based methods, but with substantially better precision and lower false positive rates. THEMATICS performs well across the spectrum of E.C. classes. The method requires only the structure of the query protein as input. THEMATICS predictions may be obtained via the web from structures in PDB format at: BioMed Central 2007-04-09 /pmc/articles/PMC1877815/ /pubmed/17419878 http://dx.doi.org/10.1186/1471-2105-8-119 Text en Copyright © 2007 Wei et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Wei, Ying Ko, Jaeju Murga, Leonel F Ondrechen, Mary Jo Selective prediction of interaction sites in protein structures with THEMATICS |
title | Selective prediction of interaction sites in protein structures with THEMATICS |
title_full | Selective prediction of interaction sites in protein structures with THEMATICS |
title_fullStr | Selective prediction of interaction sites in protein structures with THEMATICS |
title_full_unstemmed | Selective prediction of interaction sites in protein structures with THEMATICS |
title_short | Selective prediction of interaction sites in protein structures with THEMATICS |
title_sort | selective prediction of interaction sites in protein structures with thematics |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1877815/ https://www.ncbi.nlm.nih.gov/pubmed/17419878 http://dx.doi.org/10.1186/1471-2105-8-119 |
work_keys_str_mv | AT weiying selectivepredictionofinteractionsitesinproteinstructureswiththematics AT kojaeju selectivepredictionofinteractionsitesinproteinstructureswiththematics AT murgaleonelf selectivepredictionofinteractionsitesinproteinstructureswiththematics AT ondrechenmaryjo selectivepredictionofinteractionsitesinproteinstructureswiththematics |
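
The abstract in this record reports recall, precision, and false positive rates at several Z score cut-offs but does not spell out how they are computed. The sketch below shows the standard residue-level definitions, which are assumed here rather than taken from the paper. The function name `site_metrics`, the `SiteMetrics` type, and the residue labels are hypothetical and used for illustration only; the example numbers are invented and are not results from the article.

```python
from typing import NamedTuple, Set


class SiteMetrics(NamedTuple):
    recall: float
    precision: float
    false_positive_rate: float


def site_metrics(predicted: Set[str], annotated: Set[str],
                 all_residues: Set[str]) -> SiteMetrics:
    """Residue-level metrics under the standard definitions (an assumption).

    predicted    -- residues flagged by a site predictor
    annotated    -- residues in the reference annotation
    all_residues -- every residue considered in the protein
    """
    tp = len(predicted & annotated)                 # predicted and annotated
    fp = len(predicted - annotated)                 # predicted, not annotated
    fn = len(annotated - predicted)                 # annotated, missed
    tn = len(all_residues - predicted - annotated)  # neither

    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return SiteMetrics(recall, precision, fpr)


# Hypothetical 100-residue protein with 4 annotated and 5 predicted residues.
all_residues = {f"R{i}" for i in range(1, 101)}
annotated = {"R10", "R20", "R30", "R40"}
predicted = {"R10", "R20", "R55", "R56", "R57"}

metrics = site_metrics(predicted, annotated, all_residues)
print(metrics)
```

Under these definitions, loosening a cut-off so that more residues are predicted can raise recall while lowering precision and raising the false positive rate, which is the trade-off the abstract describes across the three cut-off values.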