Exploiting structural and topological information to improve prediction of RNA-protein binding sites

BACKGROUND: RNA-protein interactions are important for a wide range of biological processes. Current computational methods to predict interacting residues in RNA-protein interfaces predominately rely on sequence data. It is, however, known that interface residue propensity is closely correlated with...

Bibliographic Details
Main authors: Maetschke, Stefan R; Yuan, Zheng
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2774325/
https://www.ncbi.nlm.nih.gov/pubmed/19835626
http://dx.doi.org/10.1186/1471-2105-10-341
_version_ 1782173930105077760
author Maetschke, Stefan R
Yuan, Zheng
author_facet Maetschke, Stefan R
Yuan, Zheng
author_sort Maetschke, Stefan R
collection PubMed
description BACKGROUND: RNA-protein interactions are important for a wide range of biological processes. Current computational methods to predict interacting residues in RNA-protein interfaces predominately rely on sequence data. It is, however, known that interface residue propensity is closely correlated with structural properties. In this paper we systematically study information obtained from sequences and structures and compare their contributions in this prediction problem. Particularly, different geometrical and network topological properties of protein structures are evaluated to improve interface residue prediction accuracy. RESULTS: We have quantified the impact of structural information on the prediction accuracy in comparison to the purely sequence based approach using two machine learning techniques: Naïve Bayes classifiers and Support Vector Machines. The highest AUC of 0.83 was achieved by a Support Vector Machine, exploiting PSI-BLAST profile, accessible surface area, betweenness-centrality and retention coefficient as input features. Taking into account that our results are based on a larger non-redundant data set, the prediction accuracy is considerably higher than reported in previous, comparable studies. A protein-RNA interface predictor (PRIP) and the data set have been made available at . CONCLUSION: Graph-theoretic properties of residue contact maps derived from protein structures such as betweenness-centrality can supplement sequence or structure features to improve the prediction accuracy for binding residues in RNA-protein interactions. While Support Vector Machines perform better on this task, Naïve Bayes classifiers also have been found to achieve good prediction accuracies but require much less training time and are an attractive choice for large scale predictions.
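The abstract names betweenness-centrality of residue contact maps as one of the graph-theoretic features. As a rough illustration of what such a feature measures, the sketch below builds a contact graph from 3D points and computes unnormalised betweenness with Brandes' algorithm, a standard method for this quantity. It is not taken from the paper: the function names, the toy coordinates, and the distance cutoff are all hypothetical, and the authors' actual contact definition and normalisation may differ.

```python
from collections import deque
from math import dist


def contact_graph(coords, cutoff):
    """Adjacency lists: residues i and j are 'in contact' if their
    representative points lie within `cutoff` (hypothetical threshold)."""
    n = len(coords)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def betweenness(adj):
    """Unnormalised betweenness centrality for an unweighted,
    undirected graph (Brandes' algorithm)."""
    n = len(adj)
    bc = [0.0] * n
    for s in range(n):
        # Breadth-first search from s: count shortest paths (sigma)
        # and record the predecessors of each node on those paths.
        sigma = [0] * n
        sigma[s] = 1
        depth = [-1] * n
        depth[s] = 0
        preds = [[] for _ in range(n)]
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if depth[w] < 0:
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        delta = [0.0] * n
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was counted once from each endpoint.
    return [b / 2.0 for b in bc]


# Toy example: five points on a line, so the contact graph is a chain.
toy_coords = [(float(i), 0.0, 0.0) for i in range(5)]
toy_bc = betweenness(contact_graph(toy_coords, cutoff=1.5))
print(toy_bc)
```

In the chain, the middle residue lies on the most shortest paths and gets the highest score, which is the intuition behind using this value as a per-residue input feature alongside sequence-derived ones.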
format Text
id pubmed-2774325
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-27743252009-11-07 Exploiting structural and topological information to improve prediction of RNA-protein binding sites Maetschke, Stefan R Yuan, Zheng BMC Bioinformatics Research Article BioMed Central 2009-10-18 /pmc/articles/PMC2774325/ /pubmed/19835626 http://dx.doi.org/10.1186/1471-2105-10-341 Text en Copyright © 2009 Maetschke and Yuan; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Maetschke, Stefan R
Yuan, Zheng
Exploiting structural and topological information to improve prediction of RNA-protein binding sites
title Exploiting structural and topological information to improve prediction of RNA-protein binding sites
title_full Exploiting structural and topological information to improve prediction of RNA-protein binding sites
title_fullStr Exploiting structural and topological information to improve prediction of RNA-protein binding sites
title_full_unstemmed Exploiting structural and topological information to improve prediction of RNA-protein binding sites
title_short Exploiting structural and topological information to improve prediction of RNA-protein binding sites
title_sort exploiting structural and topological information to improve prediction of rna-protein binding sites
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2774325/
https://www.ncbi.nlm.nih.gov/pubmed/19835626
http://dx.doi.org/10.1186/1471-2105-10-341
work_keys_str_mv AT maetschkestefanr exploitingstructuralandtopologicalinformationtoimprovepredictionofrnaproteinbindingsites
AT yuanzheng exploitingstructuralandtopologicalinformationtoimprovepredictionofrnaproteinbindingsites