CRF-based models of protein surfaces improve protein-protein interaction site predictions

Bibliographic Details
Main Authors: Dong, Zhijie, Wang, Keyu, Linh Dang, Truong Khanh, Gültas, Mehmet, Welter, Marlon, Wierschin, Torsten, Stanke, Mario, Waack, Stephan
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4150965/
https://www.ncbi.nlm.nih.gov/pubmed/25124108
http://dx.doi.org/10.1186/1471-2105-15-277
author Dong, Zhijie
Wang, Keyu
Linh Dang, Truong Khanh
Gültas, Mehmet
Welter, Marlon
Wierschin, Torsten
Stanke, Mario
Waack, Stephan
collection PubMed
description BACKGROUND: The identification of protein-protein interaction sites is a computationally challenging task and is important for understanding the biology of protein complexes. There is a rich literature in this field. A broad class of approaches assigns to each candidate residue a real-valued score that measures how likely it is that the residue belongs to the interface. The prediction is obtained by thresholding this score. Some probabilistic models classify the residues on the basis of posterior probabilities. In this paper, we introduce pairwise conditional random fields (pCRFs) in which edges are not restricted to the backbone, as they are in the linear-chain CRFs used by Li et al. (2007). In fact, any 3D-neighborhood relation can be modeled. Building on a generalized Viterbi inference algorithm and a piecewise training process for pCRFs, we demonstrate how to use pCRFs to enhance a given residue-wise, score-based protein-protein interface predictor on the surface of the protein under study. The features of the pCRF are based solely on the interface prediction scores of the predictor whose performance is to be improved.
RESULTS: We performed three sets of experiments with synthetic scores assigned to the surface residues of proteins taken from the data set PlaneDimers compiled by Zellner et al. (2011), from the list published by Keskin et al. (2004), and from the recent data set of Cukuroglu et al. (2014). These experiments demonstrate that our pCRF-based enhancer is effective provided that the score distributions of the interface residues and of the non-interface residues are both unimodal. Moreover, the pCRF-based enhancer is also applicable if the distributions are unimodal only over a certain sub-domain; the improvement is then restricted to that sub-domain. In this way we were able to improve the predictions of the PresCont server devised by Zellner et al. (2011) on PlaneDimers.
CONCLUSIONS: Our results strongly suggest that pCRFs form a methodological framework for improving residue-wise, score-based protein-protein interface predictors, provided that the scores are appropriately distributed. A prototypical implementation of our method is accessible at http://ppicrf.informatik.uni-goettingen.de/index.html.
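To make the contrast in the abstract concrete, the following minimal Python sketch compares plain score thresholding with a pairwise model in which neighboring surface residues are encouraged to share a label. It is an illustration under stated assumptions and not the authors' implementation: the scores are treated as if they were probabilities, the edge term is a simple Potts-style disagreement penalty, the names `threshold_labels`, `pairwise_energy`, `map_labels` and the parameter `coupling` are hypothetical, and exhaustive enumeration stands in for the generalized Viterbi algorithm and the piecewise training described in the paper.

```python
from itertools import product
import math


def threshold_labels(scores, cutoff=0.5):
    """Baseline: residue i is labeled interface (1) iff its score reaches the cutoff."""
    return [1 if s >= cutoff else 0 for s in scores]


def pairwise_energy(labels, scores, edges, coupling):
    """Energy of a labeling: score-based unary terms plus a penalty per disagreeing edge.

    Assumption: each score is read as a probability of being an interface residue.
    """
    eps = 1e-9
    unary = 0.0
    for y, s in zip(labels, scores):
        p = min(max(s, eps), 1.0 - eps)  # clamp to avoid log(0)
        unary += -math.log(p if y == 1 else 1.0 - p)
    pair = sum(coupling for i, j in edges if labels[i] != labels[j])
    return unary + pair


def map_labels(scores, edges, coupling=0.6):
    """Lowest-energy labeling by exhaustive enumeration (feasible only for tiny graphs)."""
    best = min(
        product((0, 1), repeat=len(scores)),
        key=lambda lab: pairwise_energy(lab, scores, edges, coupling),
    )
    return list(best)


# Toy surface patch: residues 0-3 are mutual 3D neighbors, residues 4-5 form a second patch.
scores = [0.9, 0.8, 0.45, 0.85, 0.1, 0.2]
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (1, 3), (4, 5)]

print(threshold_labels(scores))   # residue 2 falls just below the cutoff
print(map_labels(scores, edges))  # the neighborhood term pulls residue 2 into the interface
```

In this toy example residue 2 has a borderline score but is surrounded by high-scoring neighbors, so the pairwise term changes its label, whereas setting `coupling=0` reduces the model to thresholding at 0.5.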
format Online
Article
Text
id pubmed-4150965
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4150965 2014-09-03 BMC Bioinformatics Methodology Article BioMed Central 2014-08-13 /pmc/articles/PMC4150965/ /pubmed/25124108 http://dx.doi.org/10.1186/1471-2105-15-277 Text en © Dong et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title CRF-based models of protein surfaces improve protein-protein interaction site predictions
topic Methodology Article