Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network
RNA-binding proteins (RBPs) play important roles in regulating gene expression through post-transcriptional control of RNAs, and in the development and function of the immune system. Thanks to advances in sequencing technology, many RNA sequences have been newly discovered whose RBP binding partners are unknown. Therefore...
Main authors: | Chung, Taesu; Kim, Dongsup |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6485761/ https://www.ncbi.nlm.nih.gov/pubmed/31026297 http://dx.doi.org/10.1371/journal.pone.0216257 |
_version_ | 1783414297538330624 |
---|---|
author | Chung, Taesu Kim, Dongsup |
author_facet | Chung, Taesu Kim, Dongsup |
author_sort | Chung, Taesu |
collection | PubMed |
description | RNA-binding proteins (RBPs) play important roles in regulating gene expression through post-transcriptional control of RNAs, and in the development and function of the immune system. Thanks to advances in sequencing technology, many RNA sequences have been newly discovered whose RBP binding partners are unknown. Demand for accurate methods of predicting RBP binding sites is therefore increasing. Many attempts have been made to predict RBP binding sites using various machine-learning techniques combined with various RNA features. In this work, we present a new deep convolutional neural network model, trained on CLIP-seq datasets, that uses multi-sized filters and multi-modal features to predict the binding properties of RBPs. With this model, we integrated sequence and structure information to extract sequence motifs, structure motifs, and combined motifs at the same time. RBP binding site prediction on the RBP-24 dataset was compared with two multi-modal methods, GraphProt and Deepnet-rbp, using the area under the curve (AUC) of the receiver operating characteristic (ROC). Our method (average AUC = 0.920) outperformed GraphProt (average AUC = 0.888) on 20 RBPs and Deepnet-rbp (average AUC = 0.902) on 15 RBPs. The improvement was achieved by using multi-sized convolution filters, for which the average relative error reduction was 17%. Introducing a new RNA structure representation, the structure probability matrix, reduced the average relative error by 3% compared with a one-hot encoded secondary structure representation. Interestingly, the structure probability matrix was particularly effective on ALKBH5, where the relative error reduction was 30%. We also developed a new sequence motif enrichment method, which we call the response enrichment method. We successfully enriched sequence motifs for 12 RBPs, which closely resembled previously reported motifs from the literature, RBPgroup and CISBP-RNA. Finally, by analyzing these results together, we found an intricate interplay between sequence motifs and structure motifs, which agrees with other studies. |
format | Online Article Text |
id | pubmed-6485761 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6485761 2019-05-09 Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network Chung, Taesu Kim, Dongsup PLoS One Research Article Public Library of Science 2019-04-26 /pmc/articles/PMC6485761/ /pubmed/31026297 http://dx.doi.org/10.1371/journal.pone.0216257 Text en © 2019 Chung, Kim http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Chung, Taesu Kim, Dongsup Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network |
title | Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network |
title_full | Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network |
title_fullStr | Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network |
title_full_unstemmed | Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network |
title_short | Prediction of binding property of RNA-binding proteins using multi-sized filters and multi-modal deep convolutional neural network |
title_sort | prediction of binding property of rna-binding proteins using multi-sized filters and multi-modal deep convolutional neural network |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6485761/ https://www.ncbi.nlm.nih.gov/pubmed/31026297 http://dx.doi.org/10.1371/journal.pone.0216257 |
work_keys_str_mv | AT chungtaesu predictionofbindingpropertyofrnabindingproteinsusingmultisizedfiltersandmultimodaldeepconvolutionalneuralnetwork AT kimdongsup predictionofbindingpropertyofrnabindingproteinsusingmultisizedfiltersandmultimodaldeepconvolutionalneuralnetwork |
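
The description above reports improvements as "relative error reduction" without defining the term. The sketch below shows one common reading, assuming the error of a model is 1 - AUC; that definition, the function name `relative_error_reduction`, and the `AVERAGE_AUC` dictionary are illustrative assumptions, not taken from the record. It applies the formula only to the average AUCs quoted in the description. The 17%, 3%, and 30% figures in the description compare different configurations of the authors' own model and are presumably averaged per protein, so they are not expected to match these numbers.

```python
def relative_error_reduction(auc_baseline: float, auc_new: float) -> float:
    """Fraction of the baseline error removed by the new model.

    Assumption (not stated in the record): error is defined as 1 - AUC.
    """
    baseline_error = 1.0 - auc_baseline
    new_error = 1.0 - auc_new
    if baseline_error <= 0.0:
        raise ValueError("baseline AUC must be below 1.0")
    return (baseline_error - new_error) / baseline_error


# Average AUCs on the RBP-24 dataset, as quoted in the description field.
AVERAGE_AUC = {"this work": 0.920, "GraphProt": 0.888, "Deepnet-rbp": 0.902}

for baseline in ("GraphProt", "Deepnet-rbp"):
    reduction = relative_error_reduction(
        AVERAGE_AUC[baseline], AVERAGE_AUC["this work"]
    )
    # Printed as a percentage of the baseline's error.
    print(f"vs {baseline}: {reduction:.1%}")
```

Under this assumed definition, a ratio computed from average AUCs will generally differ from an average of per-protein ratios, which is one reason the values printed here should not be read as the paper's reported figures.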