
SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines

BACKGROUND: It is important to predict the quality of a protein structural model before its native structure is known. Methods that can predict the absolute local quality of individual residues in a single protein model are rare, yet they are particularly needed for using, ranking and refining protein mode...


Bibliographic Details
Main authors: Cao, Renzhi; Wang, Zheng; Wang, Yiheng; Cheng, Jianlin
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4013430/
https://www.ncbi.nlm.nih.gov/pubmed/24776231
http://dx.doi.org/10.1186/1471-2105-15-120
author Cao, Renzhi
Wang, Zheng
Wang, Yiheng
Cheng, Jianlin
collection PubMed
description BACKGROUND: It is important to predict the quality of a protein structural model before its native structure is known. Methods that can predict the absolute local quality of individual residues in a single protein model are rare, yet they are particularly needed for using, ranking and refining protein models. RESULTS: We developed a machine learning tool (SMOQ) that predicts the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (the basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts, to make predictions. We also trained an SVM model with two additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them improves performance only when the real deviation between the native structure and the model is higher than 5 Å. The released SMOQ tool uses the basic feature set trained on 85 CASP8 targets. Moreover, SMOQ implements a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637 Å. The global quality prediction accuracy of the tool is comparable to other good tools on the same benchmark. CONCLUSION: SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/.
format Online
Article
Text
id pubmed-4013430
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4013430 2014-05-22 SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines Cao, Renzhi Wang, Zheng Wang, Yiheng Cheng, Jianlin BMC Bioinformatics Software
BioMed Central 2014-04-28 /pmc/articles/PMC4013430/ /pubmed/24776231 http://dx.doi.org/10.1186/1471-2105-15-120 Text en Copyright © 2014 Cao et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines
topic Software
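
The abstract in this record mentions two computations: the average difference between predicted and actual per-residue distance deviations (reported as 2.637 Å on the test data), and a conversion of predicted local quality scores into a global quality score. The record does not say how SMOQ performs the conversion, so the sketch below is only an illustration under stated assumptions: the function names are hypothetical, the conversion is a generic S-score-style average, and the 5.0 Å scaling distance `d0` is an assumed parameter rather than a value taken from the article.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and actual
    per-residue distance deviations (both in angstroms)."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def global_score(deviations: Sequence[float], d0: float = 5.0) -> float:
    """Collapse per-residue deviations into one score in (0, 1].

    ASSUMPTION: a generic S-score-style transform, 1 / (1 + (d / d0)^2),
    averaged over residues. Neither the formula nor d0 = 5.0 is confirmed
    by this record as the method SMOQ actually uses.
    """
    if len(deviations) == 0:
        raise ValueError("deviations must be non-empty")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in deviations) / len(deviations)


if __name__ == "__main__":
    # Made-up illustrative numbers, not data from the article.
    predicted = [1.0, 4.0, 7.5]
    actual = [2.0, 2.0, 7.5]
    print(mean_absolute_error(predicted, actual))
    print(global_score(predicted))
```

With this kind of transform, residues predicted to lie close to their native positions contribute values near 1 and badly placed residues contribute values near 0, so the average behaves like a fraction of well-modeled residues.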