A graph kernel approach for alignment-free domain–peptide interaction prediction with an application to human SH3 domains

Motivation: State-of-the-art experimental data for determining binding specificities of peptide recognition modules (PRMs) is obtained by high-throughput approaches like peptide arrays. Most prediction tools applicable to this kind of data are based on an initial multiple alignment of the peptide li...

Bibliographic Details
Main authors: Kundu, Kousik; Costa, Fabrizio; Backofen, Rolf
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3694653/
https://www.ncbi.nlm.nih.gov/pubmed/23813002
http://dx.doi.org/10.1093/bioinformatics/btt220
author Kundu, Kousik
Costa, Fabrizio
Backofen, Rolf
collection PubMed
description Motivation: State-of-the-art experimental data for determining binding specificities of peptide recognition modules (PRMs) are obtained by high-throughput approaches like peptide arrays. Most prediction tools applicable to this kind of data are based on an initial multiple alignment of the peptide ligands. Building an initial alignment can be error-prone, especially in the case of the proline-rich peptides bound by the SH3 domains. Results: Here, we present a machine-learning approach based on an efficient graph-kernel technique to predict the specificity of a large set of 70 human SH3 domains, which are an important class of PRMs. The graph-kernel strategy allows us to (i) integrate several types of physico-chemical information for each amino acid, (ii) consider high-order correlations between these features and (iii) eliminate the need for an initial peptide alignment. We build specialized models for each human SH3 domain and achieve competitive predictive performance of 0.73 area under the precision-recall curve, compared with 0.27 area under the precision-recall curve for state-of-the-art methods based on position weight matrices. We show that better models can be obtained when we use information on the noninteracting peptides (negative examples), which is currently not used by the state-of-the-art approaches based on position weight matrices. To this end, we analyze two strategies to identify subsets of high-confidence negative data. The techniques introduced here are more general and hence can also be used for any other protein domains that interact with short peptides (i.e. other PRMs). Availability: The program with the predictive models can be found at http://www.bioinf.uni-freiburg.de/Software/SH3PepInt/SH3PepInt.tar.gz. We also provide a genome-wide prediction for all 70 human SH3 domains, which can be found under http://www.bioinf.uni-freiburg.de/Software/SH3PepInt/Genome-Wide-Predictions.tar.gz.
Contact: backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3694653
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3694653 2013-06-27 Bioinformatics Ismb/Eccb 2013 Proceedings Papers Committee July 21 to July 23, 2013, Berlin, Germany Oxford University Press 2013-07-01 2013-06-19 /pmc/articles/PMC3694653/ /pubmed/23813002 http://dx.doi.org/10.1093/bioinformatics/btt220 Text en © The Author 2013. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A graph kernel approach for alignment-free domain–peptide interaction prediction with an application to human SH3 domains
topic Ismb/Eccb 2013 Proceedings Papers Committee July 21 to July 23, 2013, Berlin, Germany