
Protein contact order prediction from primary sequences

BACKGROUND: Contact order is a topological descriptor that has been shown to be correlated with several interesting protein properties such as protein folding rates and protein transition state placements. Contact order has also been used to select for viable protein folds from ab initio protein str...


Bibliographic Details
Main authors: Shi, Yi; Zhou, Jianjun; Arndt, David; Wishart, David S; Lin, Guohui
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2440764/
https://www.ncbi.nlm.nih.gov/pubmed/18513429
http://dx.doi.org/10.1186/1471-2105-9-255
_version_ 1782156576153403392
author Shi, Yi
Zhou, Jianjun
Arndt, David
Wishart, David S
Lin, Guohui
author_sort Shi, Yi
collection PubMed
description BACKGROUND: Contact order is a topological descriptor that has been shown to be correlated with several interesting protein properties such as protein folding rates and protein transition state placements. Contact order has also been used to select viable protein folds from ab initio protein structure prediction programs. For proteins of known three-dimensional structure, contact order can be calculated directly. However, for proteins with unknown three-dimensional structure, there is no effective prediction method currently available. RESULTS: In this paper, we propose several simple yet very effective methods to predict contact order from the amino acid sequence only. One set of methods is based on a weighted linear combination of predicted secondary structure content and amino acid composition. Depending on the number of components used in these equations, it is possible to achieve a correlation coefficient of 0.857–0.870 between the observed and predicted contact order. A second method, based on sequence similarity to known three-dimensional structures, is able to achieve a correlation coefficient of 0.977. We have also developed a much more robust implementation for calculating contact order directly from PDB coordinates that works for more than 99% of PDB files. All of these contact order predictors and calculators have been implemented as a web server (see Availability and requirements section for URL). CONCLUSION: Protein contact order can be effectively predicted from the primary sequence, in the absence of three-dimensional structure. Three factors (the percentage of residues in alpha helices, the percentage of residues in beta strands, and sequence length) appear to be strongly correlated with the absolute contact order.
format Text
id pubmed-2440764
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2440764 2008-06-27 Protein contact order prediction from primary sequences Shi, Yi Zhou, Jianjun Arndt, David Wishart, David S Lin, Guohui BMC Bioinformatics Research Article
BioMed Central 2008-05-30 /pmc/articles/PMC2440764/ /pubmed/18513429 http://dx.doi.org/10.1186/1471-2105-9-255 Text en Copyright © 2008 Shi et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Protein contact order prediction from primary sequences
topic Research Article