Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms

Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling...

Bibliographic Details
Main authors: Chakraborty, Sandeep; Venkatramani, Ravindra; Rao, Basuthkar J.; Asgeirsson, Bjarni; Dandekar, Abhaya M.
Format: Online Article Text
Language: English
Published: F1000Research 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3892923/
https://www.ncbi.nlm.nih.gov/pubmed/24555103
http://dx.doi.org/10.12688/f1000research.2-211.v3
author Chakraborty, Sandeep
Venkatramani, Ravindra
Rao, Basuthkar J.
Asgeirsson, Bjarni
Dandekar, Abhaya M.
collection PubMed
description Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling - both of which typically generate multiple possible native state structures. Model quality assessment programs (MQAPs) validate these predicted structures in order to identify the correct native state structure. Here, we propose an MQAP for assessing the quality of protein structures based on the distances between consecutive Cα atoms. We hypothesize that the root-mean-square deviation of the distance between consecutive Cα atoms (RDCC) from the ideal value of 3.8 Å, derived from a statistical analysis of high quality protein structures (top100H database), is minimized in native structures. Based on tests with the top100H set, we propose an RDCC cutoff value of 0.012 Å, above which a structure can be filtered out as non-native. We applied the RDCC discriminator to decoy sets from the Decoys 'R' Us database to show that the native structures in all decoy sets tested have an RDCC below the 0.012 Å cutoff. While most decoy sets were either indistinguishable using this discriminator or had very few violations, all the decoy structures in the fisa decoy set were discriminated by applying the RDCC criterion. This highlights the physical non-viability of the fisa decoy set, and possible issues in benchmarking other methods using this set. The source code and manual are available at https://github.com/sanchak/mqap and permanently available at DOI 10.5281/zenodo.7134.
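The RDCC quantity named in the abstract can be illustrated with a short sketch. This is one plain reading of the abstract (root-mean-square deviation of consecutive Cα–Cα distances from 3.8 Å), not the authors' program from the mqap repository. The function names `ca_coordinates` and `rdcc` are hypothetical, and details the abstract does not state (normalization, chain breaks, alternate locations, cis-peptide bonds) are simply ignored here, so values may not be comparable with the published 0.012 Å cutoff.

```python
import math

IDEAL_CA_DISTANCE = 3.8   # Å, ideal consecutive Cα distance quoted in the abstract
RDCC_CUTOFF = 0.012       # Å, cutoff quoted in the abstract


def ca_coordinates(pdb_lines, chain_id=None):
    """Collect Cα coordinates from fixed-column PDB ATOM records, in file order.

    Assumption: no handling of alternate locations, insertion codes, or
    chain breaks; every consecutive pair of Cα records is treated as bonded.
    """
    coords = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":
            continue
        if chain_id is not None and line[21] != chain_id:
            continue
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def rdcc(coords, ideal=IDEAL_CA_DISTANCE):
    """Root-mean-square deviation of consecutive Cα distances from `ideal`.

    Assumption: a plain RMS over all consecutive pairs; the paper's exact
    definition may differ.
    """
    if len(coords) < 2:
        raise ValueError("need at least two Cα atoms")
    squared = [(math.dist(a, b) - ideal) ** 2 for a, b in zip(coords, coords[1:])]
    return math.sqrt(sum(squared) / len(squared))


def exceeds_cutoff(coords, cutoff=RDCC_CUTOFF):
    """True if the structure would be filtered out under this sketch's RDCC."""
    return rdcc(coords) > cutoff
```

To try it on a file, read the lines of a PDB entry, pass them to `ca_coordinates`, and feed the result to `rdcc` or `exceeds_cutoff`. `math.dist` requires Python 3.8 or later.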
format Online
Article
Text
id pubmed-3892923
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-3892923 2014-01-29 F1000Res Short Research Article
F1000Research 2013-12-17 /pmc/articles/PMC3892923/ /pubmed/24555103 http://dx.doi.org/10.12688/f1000research.2-211.v3 Text en Copyright: © 2013 Chakraborty S et al. http://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. http://creativecommons.org/publicdomain/zero/1.0/ Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
title Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms
topic Short Research Article