CRYSTALP2: sequence-based protein crystallization propensity prediction
BACKGROUND: Current protocols yield crystals for <30% of known proteins, indicating that automatically identifying crystallizable proteins may improve high-throughput structural genomics efforts. We introduce CRYSTALP2, a kernel-based method that predicts the propensity of a given protein sequenc...
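
The record's description says the method uses the composition and collocation of amino acids estimated from the primary sequence. As a rough illustration only, such features could be computed as below. This is a minimal sketch, not the authors' implementation: the function names, the 20-letter alphabet constant, and the reading of "collocation" as the frequency of residue pairs separated by a fixed gap are all assumptions made for this example.

```python
from collections import Counter

# Assumption: the 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fraction of each standard amino acid in seq."""
    seq = seq.upper()
    if not seq:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in AMINO_ACIDS}


def collocation(seq, gap=0):
    """Return the frequency of residue pairs separated by `gap` residues.

    gap=0 counts adjacent pairs (dipeptides); gap=1 counts pairs with one
    residue between them, and so on. This definition is an assumption made
    for illustration, not taken from the record.
    """
    seq = seq.upper()
    step = gap + 1
    pairs = Counter(seq[i] + seq[i + step] for i in range(len(seq) - step))
    total = sum(pairs.values())
    if total == 0:
        return {}
    return {pair: count / total for pair, count in pairs.items()}
```

For example, `composition("AAGG")["A"]` evaluates to 0.5, and `collocation("AKAK", gap=1)` counts the pairs `AA` and `KK` once each. The isoelectric point and hydrophobicity features are not sketched here.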
Main Authors: | Kurgan, Lukasz; Razib, Ali A; Aghakhani, Sara; Dick, Scott; Mizianty, Marcin; Jahandideh, Samad |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2731098/ https://www.ncbi.nlm.nih.gov/pubmed/19646256 http://dx.doi.org/10.1186/1472-6807-9-50 |
_version_ | 1782170939347173376 |
---|---|
author | Kurgan, Lukasz; Razib, Ali A; Aghakhani, Sara; Dick, Scott; Mizianty, Marcin; Jahandideh, Samad |
author_facet | Kurgan, Lukasz; Razib, Ali A; Aghakhani, Sara; Dick, Scott; Mizianty, Marcin; Jahandideh, Samad |
author_sort | Kurgan, Lukasz |
collection | PubMed |
description | BACKGROUND: Current protocols yield crystals for <30% of known proteins, indicating that automatically identifying crystallizable proteins may improve high-throughput structural genomics efforts. We introduce CRYSTALP2, a kernel-based method that predicts the propensity of a given protein sequence to produce diffraction-quality crystals. This method utilizes the composition and collocation of amino acids, isoelectric point, and hydrophobicity, as estimated from the primary sequence, to generate predictions. CRYSTALP2 extends its predecessor, CRYSTALP, by enabling predictions for sequences of unrestricted size and provides improved prediction quality. RESULTS: A significant majority of the collocations used by CRYSTALP2 include residues with high conformational entropy, or low entropy and high potential to mediate crystal contacts; notably, such residues are utilized by surface entropy reduction methods. We show that the collocations provide complementary information to the hydrophobicity and isoelectric point. Tests on four datasets show that CRYSTALP2 outperforms several existing sequence-based predictors (CRYSTALP, OB-score, and SECRET). CRYSTALP2's accuracy, MCC, and AROC range between 69.3 and 77.5%, 0.39 and 0.55, and 0.72 and 0.79, respectively. Our predictions are similar in quality and are complementary to the predictions of the most recent ParCrys and XtalPred methods. Our results also suggest that, as work in protein crystallization continues (thereby enlarging the population of proteins with known crystallization propensities), the prediction quality of the CRYSTALP2 method should increase. The prediction model and the datasets used in this contribution can be downloaded from . CONCLUSION: CRYSTALP2 provides relatively accurate crystallization propensity predictions for a given protein chain that either outperform or complement the existing approaches. The proposed method can be used to support current efforts towards improving the success rate in obtaining diffraction-quality crystals. |
format | Text |
id | pubmed-2731098 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2731098 2009-08-24 CRYSTALP2: sequence-based protein crystallization propensity prediction Kurgan, Lukasz; Razib, Ali A; Aghakhani, Sara; Dick, Scott; Mizianty, Marcin; Jahandideh, Samad BMC Struct Biol Methodology Article BACKGROUND: Current protocols yield crystals for <30% of known proteins, indicating that automatically identifying crystallizable proteins may improve high-throughput structural genomics efforts. We introduce CRYSTALP2, a kernel-based method that predicts the propensity of a given protein sequence to produce diffraction-quality crystals. This method utilizes the composition and collocation of amino acids, isoelectric point, and hydrophobicity, as estimated from the primary sequence, to generate predictions. CRYSTALP2 extends its predecessor, CRYSTALP, by enabling predictions for sequences of unrestricted size and provides improved prediction quality. RESULTS: A significant majority of the collocations used by CRYSTALP2 include residues with high conformational entropy, or low entropy and high potential to mediate crystal contacts; notably, such residues are utilized by surface entropy reduction methods. We show that the collocations provide complementary information to the hydrophobicity and isoelectric point. Tests on four datasets show that CRYSTALP2 outperforms several existing sequence-based predictors (CRYSTALP, OB-score, and SECRET). CRYSTALP2's accuracy, MCC, and AROC range between 69.3 and 77.5%, 0.39 and 0.55, and 0.72 and 0.79, respectively. Our predictions are similar in quality and are complementary to the predictions of the most recent ParCrys and XtalPred methods. Our results also suggest that, as work in protein crystallization continues (thereby enlarging the population of proteins with known crystallization propensities), the prediction quality of the CRYSTALP2 method should increase. The prediction model and the datasets used in this contribution can be downloaded from . CONCLUSION: CRYSTALP2 provides relatively accurate crystallization propensity predictions for a given protein chain that either outperform or complement the existing approaches. The proposed method can be used to support current efforts towards improving the success rate in obtaining diffraction-quality crystals. BioMed Central 2009-07-31 /pmc/articles/PMC2731098/ /pubmed/19646256 http://dx.doi.org/10.1186/1472-6807-9-50 Text en Copyright © 2009 Kurgan et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Kurgan, Lukasz Razib, Ali A Aghakhani, Sara Dick, Scott Mizianty, Marcin Jahandideh, Samad CRYSTALP2: sequence-based protein crystallization propensity prediction |
title | CRYSTALP2: sequence-based protein crystallization propensity prediction |
title_full | CRYSTALP2: sequence-based protein crystallization propensity prediction |
title_fullStr | CRYSTALP2: sequence-based protein crystallization propensity prediction |
title_full_unstemmed | CRYSTALP2: sequence-based protein crystallization propensity prediction |
title_short | CRYSTALP2: sequence-based protein crystallization propensity prediction |
title_sort | crystalp2: sequence-based protein crystallization propensity prediction |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2731098/ https://www.ncbi.nlm.nih.gov/pubmed/19646256 http://dx.doi.org/10.1186/1472-6807-9-50 |
work_keys_str_mv | AT kurganlukasz crystalp2sequencebasedproteincrystallizationpropensityprediction AT razibalia crystalp2sequencebasedproteincrystallizationpropensityprediction AT aghakhanisara crystalp2sequencebasedproteincrystallizationpropensityprediction AT dickscott crystalp2sequencebasedproteincrystallizationpropensityprediction AT miziantymarcin crystalp2sequencebasedproteincrystallizationpropensityprediction AT jahandidehsamad crystalp2sequencebasedproteincrystallizationpropensityprediction |
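
The description reports accuracy and MCC (Matthews correlation coefficient) for a binary crystallizable / non-crystallizable prediction. As a reminder of how those two figures relate to a confusion matrix, here is a minimal sketch using the standard definitions. The function names and the example counts are hypothetical and are not taken from the article.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 to 1. By a common convention it is returned as 0.0
    when any row or column of the confusion matrix sums to zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to illustrate the calculation.
example = dict(tp=40, tn=35, fp=15, fn=10)
example_accuracy = accuracy(**example)
example_mcc = mcc(**example)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are unbalanced.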