
Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network

RNA-binding proteins (RBPs) play important roles in regulating gene expression through post-transcriptional control of RNAs, and in the development and function of the immune system. With the help of sequencing technology, numerous RNA sequences are being discovered whose binding partner RBPs are unknown. Therefore...


Bibliographic Details
Main Authors: Chung, Taesu; Kim, Dongsup
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6485761/
https://www.ncbi.nlm.nih.gov/pubmed/31026297
http://dx.doi.org/10.1371/journal.pone.0216257
_version_ 1783414297538330624
author Chung, Taesu
Kim, Dongsup
author_facet Chung, Taesu
Kim, Dongsup
author_sort Chung, Taesu
collection PubMed
description RNA-binding proteins (RBPs) play important roles in regulating gene expression through post-transcriptional control of RNAs, and in the development and function of the immune system. With the help of sequencing technology, numerous RNA sequences are being discovered whose binding partner RBPs are unknown. Demand for accurate methods to predict RBP binding sites is therefore increasing. Many attempts have been made to predict RBP binding sites using various machine-learning techniques combined with various RNA features. In this work, we present a new deep convolutional neural network model, trained on CLIP-seq datasets, that uses multi-sized filters and multi-modal features to predict the binding property of RBPs. With this model, we integrated sequence and structure information to extract sequence motifs, structure motifs, and combined motifs at the same time. RBP binding site prediction on the RBP-24 dataset was compared with two multi-modal methods, GraphProt and Deepnet-rbp, using the area under the curve (AUC) of the receiver operating characteristic (ROC). Our method (average AUC = 0.920) outperformed GraphProt (average AUC = 0.888) on 20 RBPs and Deepnet-rbp (average AUC = 0.902) on 15 RBPs. The improvement was achieved by using multi-sized convolution filters, for which the average relative error reduction was 17%. Introducing a new RNA structure representation, the structure probability matrix, reduced the average relative error by 3% compared with a one-hot encoded secondary structure representation. Interestingly, the structure probability matrix was more effective on ALKBH5, where the relative error reduction was 30%. We developed a new sequence motif enrichment method, which we call the response enrichment method. We successfully enriched sequence motifs for 12 RBPs, which closely resembled other evidence from the literature, RBPgroup and CISBP-RNA. Finally, by analyzing these results together, we found an intricate interplay between sequence motifs and structure motifs, which agrees with other studies.
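As a rough illustration of the "multi-sized filters" idea mentioned in the description, the sketch below one-hot encodes an RNA sequence, applies convolution filters of several widths in parallel, and concatenates the max-pooled responses into one feature vector. It is not the authors' architecture: the filter widths, the number of filters, the random weights, and the helper names (`one_hot`, `conv1d_valid`, `multi_size_features`) are all assumptions made for this example, and only the sequence input is covered, not the structure input.

```python
import numpy as np

ALPHABET = "ACGU"  # RNA nucleotides; the column order is an arbitrary choice


def one_hot(seq: str) -> np.ndarray:
    """Encode an RNA sequence as a (length, 4) one-hot matrix."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(seq), len(ALPHABET)))
    for pos, base in enumerate(seq.upper()):
        encoded[pos, index[base]] = 1.0
    return encoded


def conv1d_valid(x: np.ndarray, filters: np.ndarray) -> np.ndarray:
    """Slide filters of shape (n_filters, width, channels) along x.

    Returns an array of shape (length - width + 1, n_filters).
    """
    width = filters.shape[1]
    n_windows = x.shape[0] - width + 1
    return np.stack([
        # Dot product of every filter with the window starting at position p.
        np.tensordot(filters, x[p:p + width], axes=([1, 2], [0, 1]))
        for p in range(n_windows)
    ])


def multi_size_features(seq: str, filter_bank: list) -> np.ndarray:
    """Apply each filter group, then ReLU and global max pooling, and concatenate."""
    x = one_hot(seq)
    pooled = []
    for filters in filter_bank:
        activation = np.maximum(conv1d_valid(x, filters), 0.0)  # ReLU
        pooled.append(activation.max(axis=0))  # max over sequence positions
    return np.concatenate(pooled)


rng = np.random.default_rng(0)
widths = (3, 5, 7)  # hypothetical filter widths, not taken from the paper
n_filters = 4       # hypothetical number of filters per width
filter_bank = [rng.normal(size=(n_filters, w, len(ALPHABET))) for w in widths]

features = multi_size_features("AUGGCUACGUAGCUAGGCAU", filter_bank)
print(features.shape)  # (12,): 3 widths x 4 filters
```

Because each width is pooled separately before concatenation, short and long motifs contribute their own features regardless of where they occur in the sequence; in a trained model the random weights would be replaced by learned ones.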
format Online
Article
Text
id pubmed-6485761
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6485761 2019-05-09 Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network Chung, Taesu Kim, Dongsup PLoS One Research Article Public Library of Science 2019-04-26 /pmc/articles/PMC6485761/ /pubmed/31026297 http://dx.doi.org/10.1371/journal.pone.0216257 Text en © 2019 Chung, Kim http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Chung, Taesu
Kim, Dongsup
Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network
title Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6485761/
https://www.ncbi.nlm.nih.gov/pubmed/31026297
http://dx.doi.org/10.1371/journal.pone.0216257