De Novo Structure Prediction of Globular Proteins Aided by Sequence Variation-Derived Contacts

The advent of high accuracy residue-residue intra-protein contact prediction methods enabled a significant boost in the quality of de novo structure predictions. Here, we investigate the potential benefits of combining a well-established fragment-based folding algorithm – FRAGFOLD, with PSICOV, a co...

Bibliographic Details
Main authors: Kosciolek, Tomasz; Jones, David T.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3956894/
https://www.ncbi.nlm.nih.gov/pubmed/24637808
http://dx.doi.org/10.1371/journal.pone.0092197
author Kosciolek, Tomasz
Jones, David T.
collection PubMed
description The advent of high-accuracy residue-residue intra-protein contact prediction methods has enabled a significant boost in the quality of de novo structure predictions. Here, we investigate the potential benefits of combining a well-established fragment-based folding algorithm, FRAGFOLD, with PSICOV, a contact prediction method which uses sparse inverse covariance estimation to identify co-varying sites in multiple sequence alignments. Using a comprehensive set of 150 diverse globular target proteins, up to 266 amino acids in length, we are able to address the effectiveness and some limitations of such approaches to globular proteins in practice. Overall, we find that using fragment assembly with both statistical potentials and predicted contacts is significantly better than using either statistical potentials or contacts alone. Results show up to nearly 80% correct predictions (TM-score ≥0.5) within the analysed dataset and a mean TM-score of 0.54. Unsuccessful modelling cases arose either from conformational sampling problems or from insufficient contact prediction accuracy. Nevertheless, a strong dependency of the quality of final models on the fraction of satisfied predicted long-range contacts was observed. This not only highlights the importance of these contacts in determining the protein fold, but also (combined with other ensemble-derived qualities) provides a powerful guide to the choice of correct models and the global quality of the selected model. A proposed quality assessment scoring function achieves 0.93 precision and 0.77 recall for the discrimination of correct folds on our dataset of decoys. These findings suggest the approach is well-suited for blind predictions on a variety of globular proteins of unknown 3D structure, provided that enough homologous sequences are available to construct a large and accurate multiple sequence alignment for the initial contact prediction step.
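The two quantities the abstract relies on, the fraction of satisfied predicted long-range contacts and the precision and recall of a fold classifier, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 8 Å distance cutoff, the minimum sequence separation of 23 residues, and the toy data are all assumptions made for illustration, and the record itself does not state which definitions the paper uses.

```python
import math


def satisfied_fraction(coords, contacts, cutoff=8.0, min_sep=23):
    """Fraction of predicted long-range contacts satisfied in a model.

    coords   -- list of (x, y, z) tuples, one representative atom per residue
    contacts -- iterable of (i, j) zero-based residue index pairs
    cutoff   -- ASSUMED distance threshold in angstroms
    min_sep  -- ASSUMED minimum sequence separation for "long-range"
    """
    # Keep only pairs far enough apart in sequence to count as long-range.
    long_range = [(i, j) for i, j in contacts if abs(i - j) >= min_sep]
    if not long_range:
        return 0.0
    # A contact is satisfied when the two residues lie within the cutoff.
    hits = sum(
        1 for i, j in long_range
        if math.dist(coords[i], coords[j]) <= cutoff
    )
    return hits / len(long_range)


def precision_recall(predicted, actual):
    """Precision and recall for boolean labels (True = correct fold)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy model: 40 residues on a line, with residue 30 moved close to residue 0.
toy_coords = [(float(i), 0.0, 0.0) for i in range(40)]
toy_coords[30] = (5.0, 0.0, 0.0)
toy_contacts = [(0, 30), (0, 39), (1, 3)]  # (1, 3) is not long-range

fraction = satisfied_fraction(toy_coords, toy_contacts)

# Label a decoy "correct" when its TM-score is at least 0.5, as in the abstract.
toy_tm_scores = [0.62, 0.41, 0.55, 0.30]   # hypothetical true TM-scores
toy_predicted = [True, True, False, False]  # hypothetical classifier output
toy_actual = [tm >= 0.5 for tm in toy_tm_scores]
precision, recall = precision_recall(toy_predicted, toy_actual)
```

In the toy data, one of the two long-range pairs lies within the cutoff, and the hypothetical classifier gets one of its two positive calls right while missing one correct decoy.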
format Online
Article
Text
id pubmed-3956894
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3956894 2014-03-18 De Novo Structure Prediction of Globular Proteins Aided by Sequence Variation-Derived Contacts Kosciolek, Tomasz Jones, David T. PLoS One Research Article Public Library of Science 2014-03-17 /pmc/articles/PMC3956894/ /pubmed/24637808 http://dx.doi.org/10.1371/journal.pone.0092197 Text en © 2014 Kosciolek, Jones http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title De Novo Structure Prediction of Globular Proteins Aided by Sequence Variation-Derived Contacts
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3956894/
https://www.ncbi.nlm.nih.gov/pubmed/24637808
http://dx.doi.org/10.1371/journal.pone.0092197