Exploiting structural and topological information to improve prediction of RNA-protein binding sites
BACKGROUND: RNA-protein interactions are important for a wide range of biological processes. Current computational methods to predict interacting residues in RNA-protein interfaces predominantly rely on sequence data. It is, however, known that interface residue propensity is closely correlated with...
Main authors: | Maetschke, Stefan R; Yuan, Zheng |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2774325/ https://www.ncbi.nlm.nih.gov/pubmed/19835626 http://dx.doi.org/10.1186/1471-2105-10-341 |
_version_ | 1782173930105077760 |
---|---|
author | Maetschke, Stefan R Yuan, Zheng |
author_facet | Maetschke, Stefan R Yuan, Zheng |
author_sort | Maetschke, Stefan R |
collection | PubMed |
description | BACKGROUND: RNA-protein interactions are important for a wide range of biological processes. Current computational methods to predict interacting residues in RNA-protein interfaces predominantly rely on sequence data. It is, however, known that interface residue propensity is closely correlated with structural properties. In this paper we systematically study information obtained from sequences and structures and compare their contributions to this prediction problem. In particular, different geometrical and network topological properties of protein structures are evaluated to improve interface residue prediction accuracy. RESULTS: We have quantified the impact of structural information on prediction accuracy in comparison to the purely sequence-based approach using two machine learning techniques: Naïve Bayes classifiers and Support Vector Machines. The highest AUC of 0.83 was achieved by a Support Vector Machine exploiting PSI-BLAST profile, accessible surface area, betweenness-centrality and retention coefficient as input features. Taking into account that our results are based on a larger non-redundant data set, the prediction accuracy is considerably higher than reported in previous, comparable studies. A protein-RNA interface predictor (PRIP) and the data set have been made available at . CONCLUSION: Graph-theoretic properties of residue contact maps derived from protein structures, such as betweenness-centrality, can supplement sequence or structure features to improve the prediction accuracy for binding residues in RNA-protein interactions. While Support Vector Machines perform better on this task, Naïve Bayes classifiers have also been found to achieve good prediction accuracy; they require much less training time and are an attractive choice for large-scale predictions. |
format | Text |
id | pubmed-2774325 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2774325 2009-11-07 Exploiting structural and topological information to improve prediction of RNA-protein binding sites Maetschke, Stefan R Yuan, Zheng BMC Bioinformatics Research Article BioMed Central 2009-10-18 /pmc/articles/PMC2774325/ /pubmed/19835626 http://dx.doi.org/10.1186/1471-2105-10-341 Text en Copyright © 2009 Maetschke and Yuan; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Maetschke, Stefan R Yuan, Zheng Exploiting structural and topological information to improve prediction of RNA-protein binding sites |
title | Exploiting structural and topological information to improve prediction of RNA-protein binding sites |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2774325/ https://www.ncbi.nlm.nih.gov/pubmed/19835626 http://dx.doi.org/10.1186/1471-2105-10-341 |
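
The abstract names betweenness-centrality of residue contact maps as one of the features that improved prediction. The sketch below illustrates how such a feature could be computed; it is not the authors' implementation. The function names `contact_graph` and `betweenness_centrality`, the 8 Å distance cutoff, and the toy coordinates are all assumptions made for illustration, and the record does not state which cutoff or normalisation the paper used. Betweenness is computed with Brandes' algorithm on an unweighted, undirected graph.

```python
from collections import deque
from math import dist  # Python 3.8+


def contact_graph(coords, cutoff=8.0):
    """Link residues whose representative atoms lie within `cutoff` angstroms.

    The 8.0 default is an illustrative assumption, not a value from the paper.
    """
    adj = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def betweenness_centrality(adj):
    """Unnormalised betweenness via Brandes' algorithm (unweighted, undirected)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        depth = {v: -1 for v in adj}
        sigma[s], depth[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if depth[w] < 0:
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in bc.items()}


# Toy chain: five residues spaced 5 angstroms apart along a line, so only
# sequence neighbours fall within the cutoff.
coords = [(5.0 * i, 0.0, 0.0) for i in range(5)]
scores = betweenness_centrality(contact_graph(coords))
```

In this toy chain the middle residue lies on the most shortest paths and so receives the highest score. In a setting like the one the abstract describes, a per-residue score of this kind would be one column of the feature vector alongside sequence-derived features.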