TriPepSVM: de novo prediction of RNA-binding proteins based on short amino acid motifs

Bibliographic Details
Main Authors: Bressin, Annkatrin; Schulte-Sasse, Roman; Figini, Davide; Urdaneta, Erika C; Beckmann, Benedikt M; Marsico, Annalisa
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6511874/
https://www.ncbi.nlm.nih.gov/pubmed/30923827
http://dx.doi.org/10.1093/nar/gkz203
author Bressin, Annkatrin
Schulte-Sasse, Roman
Figini, Davide
Urdaneta, Erika C
Beckmann, Benedikt M
Marsico, Annalisa
collection PubMed
description In recent years, hundreds of novel RNA-binding proteins (RBPs) have been identified, leading to the discovery of novel RNA-binding domains. Furthermore, unstructured or disordered low-complexity regions of RBPs have been identified to play an important role in interactions with nucleic acids. However, these advances in understanding RBPs are limited mainly to eukaryotic species and we only have limited tools to faithfully predict RNA-binders in bacteria. Here, we describe a support vector machine-based method, called TriPepSVM, for the prediction of RNA-binding proteins. TriPepSVM applies string kernels to directly handle protein sequences using tri-peptide frequencies. Testing the method in human and bacteria, we find that several RBP-enriched tri-peptides occur more often in structurally disordered regions of RBPs. TriPepSVM outperforms existing applications, which consider classical structural features of RNA-binding or homology, in the task of RBP prediction in both human and bacteria. Finally, we predict 66 novel RBPs in Salmonella Typhimurium and validate the bacterial proteins ClpX, DnaJ and UbiG to associate with RNA in vivo.
format Online
Article
Text
id pubmed-6511874
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6511874 2019-05-20 Nucleic Acids Res Computational Biology Oxford University Press 2019-05-21 2019-03-29 /pmc/articles/PMC6511874/ /pubmed/30923827 http://dx.doi.org/10.1093/nar/gkz203 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title TriPepSVM: de novo prediction of RNA-binding proteins based on short amino acid motifs
topic Computational Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6511874/
https://www.ncbi.nlm.nih.gov/pubmed/30923827
http://dx.doi.org/10.1093/nar/gkz203