SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines
BACKGROUND: It is important to predict the quality of a protein structural model before its native structure is known. The method that can predict the absolute local quality of individual residues in a single protein model is rare, yet particularly needed for using, ranking and refining protein mode...
Main authors: | Cao, Renzhi; Wang, Zheng; Wang, Yiheng; Cheng, Jianlin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4013430/ https://www.ncbi.nlm.nih.gov/pubmed/24776231 http://dx.doi.org/10.1186/1471-2105-15-120 |
_version_ | 1782315049733324800 |
---|---|
author | Cao, Renzhi Wang, Zheng Wang, Yiheng Cheng, Jianlin |
author_facet | Cao, Renzhi Wang, Zheng Wang, Yiheng Cheng, Jianlin |
author_sort | Cao, Renzhi |
collection | PubMed |
description | BACKGROUND: It is important to predict the quality of a protein structural model before its native structure is known. The method that can predict the absolute local quality of individual residues in a single protein model is rare, yet particularly needed for using, ranking and refining protein models. RESULTS: We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e. basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts to make predictions. We also trained a SVM model with two new additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them can only improve the performance when real deviations between native and model are higher than 5Å. The SMOQ tool finally released uses the basic feature set trained on 85 CASP8 targets. Moreover, SMOQ implemented a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637Å. The global quality prediction accuracy of the tool is comparable to other good tools on the same benchmark. CONCLUSION: SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. |
format | Online Article Text |
id | pubmed-4013430 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-40134302014-05-22 SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines Cao, Renzhi Wang, Zheng Wang, Yiheng Cheng, Jianlin BMC Bioinformatics Software BACKGROUND: It is important to predict the quality of a protein structural model before its native structure is known. The method that can predict the absolute local quality of individual residues in a single protein model is rare, yet particularly needed for using, ranking and refining protein models. RESULTS: We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e. basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts to make predictions. We also trained a SVM model with two new additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them can only improve the performance when real deviations between native and model are higher than 5Å. The SMOQ tool finally released uses the basic feature set trained on 85 CASP8 targets. Moreover, SMOQ implemented a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637Å. The global quality prediction accuracy of the tool is comparable to other good tools on the same benchmark. CONCLUSION: SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. 
BioMed Central 2014-04-28 /pmc/articles/PMC4013430/ /pubmed/24776231 http://dx.doi.org/10.1186/1471-2105-15-120 Text en Copyright © 2014 Cao et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Cao, Renzhi Wang, Zheng Wang, Yiheng Cheng, Jianlin SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines |
title | SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines |
title_full | SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines |
title_fullStr | SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines |
title_full_unstemmed | SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines |
title_short | SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines |
title_sort | smoq: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4013430/ https://www.ncbi.nlm.nih.gov/pubmed/24776231 http://dx.doi.org/10.1186/1471-2105-15-120 |
work_keys_str_mv | AT caorenzhi smoqatoolforpredictingtheabsoluteresiduespecificqualityofasingleproteinmodelwithsupportvectormachines AT wangzheng smoqatoolforpredictingtheabsoluteresiduespecificqualityofasingleproteinmodelwithsupportvectormachines AT wangyiheng smoqatoolforpredictingtheabsoluteresiduespecificqualityofasingleproteinmodelwithsupportvectormachines AT chengjianlin smoqatoolforpredictingtheabsoluteresiduespecificqualityofasingleproteinmodelwithsupportvectormachines |
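
The description reports two quantities: the average difference between predicted and actual per-residue distance deviations, and a global quality score derived from the local predictions. The record does not give the formulas, so the following is only a minimal sketch under stated assumptions. The function names, the sample values, and the S-score-style transform with a 5 Å threshold are all illustrative and are not taken from the SMOQ source code.

```python
from statistics import mean


def mean_absolute_difference(predicted, actual):
    """Average absolute gap between predicted and actual per-residue deviations (in Å)."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def global_score(deviations, threshold=5.0):
    """Map per-residue deviations (Å) to one score in (0, 1].

    ASSUMPTION: an S-score-style transform, 1 / (1 + (d / threshold)^2),
    averaged over residues. The 5 Å threshold is a hypothetical default.
    """
    if not deviations:
        raise ValueError("deviations must be non-empty")
    return mean(1.0 / (1.0 + (d / threshold) ** 2) for d in deviations)


# Hypothetical per-residue values for a four-residue model (not real data).
predicted = [1.0, 2.5, 4.0, 8.0]
actual = [1.5, 2.0, 6.0, 5.0]

mad = mean_absolute_difference(predicted, actual)
score = global_score(predicted)
print(f"mean absolute difference: {mad:.3f} Å")
print(f"global score from predicted deviations: {score:.3f}")
```

Under this transform a residue with zero predicted deviation contributes 1, and a residue whose deviation equals the threshold contributes 0.5, so larger local errors pull the global score toward 0.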