Modeling in-vivo protein-DNA binding by combining multiple-instance learning with a hybrid deep neural network
Modeling in-vivo protein-DNA binding is not only fundamental for further understanding of regulatory mechanisms, but also a challenging task in computational biology. Deep-learning-based methods have succeeded in modeling in-vivo protein-DNA binding, but they often (1) follow the fully supervised...
Main authors: | Zhang, Qinhu; Shen, Zhen; Huang, De-Shuang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6559991/ https://www.ncbi.nlm.nih.gov/pubmed/31186519 http://dx.doi.org/10.1038/s41598-019-44966-x |
_version_ | 1783425877722267648 |
---|---|
author | Zhang, Qinhu Shen, Zhen Huang, De-Shuang |
author_facet | Zhang, Qinhu Shen, Zhen Huang, De-Shuang |
author_sort | Zhang, Qinhu |
collection | PubMed |
description | Modeling in-vivo protein-DNA binding is not only fundamental for further understanding of regulatory mechanisms, but also a challenging task in computational biology. Deep-learning-based methods have succeeded in modeling in-vivo protein-DNA binding, but they often (1) follow the fully supervised learning framework and overlook the weakly supervised information of genomic sequences, namely that a bound DNA sequence may have multiple TFBSs, and (2) use one-hot encoding to encode DNA sequences and ignore the dependencies among nucleotides. In this paper, we propose a weakly supervised framework for modeling in-vivo protein-DNA binding, which combines multiple-instance learning with a hybrid deep neural network and uses k-mer encoding to transform DNA sequences. First, this framework segments sequences into multiple overlapping instances using a sliding window, and then encodes all instances into image-like inputs of high-order dependencies using k-mer encoding. Second, it computes a score for each instance in the same bag separately, using a hybrid deep neural network that integrates convolutional and recurrent neural networks. Finally, it integrates the predicted values of all instances into the final prediction for the bag using the Noisy-and method. The experimental results on in-vivo datasets demonstrate the superior performance of the proposed framework. In addition, we explore the performance of the proposed framework when using k-mer encoding, demonstrate the performance of the Noisy-and method by comparing it with other fusion methods, and find that adding recurrent layers can improve the performance of the proposed framework. |
format | Online Article Text |
id | pubmed-6559991 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-6559991 2019-06-19 Modeling in-vivo protein-DNA binding by combining multiple-instance learning with a hybrid deep neural network Zhang, Qinhu Shen, Zhen Huang, De-Shuang Sci Rep Article Modeling in-vivo protein-DNA binding is not only fundamental for further understanding of regulatory mechanisms, but also a challenging task in computational biology. Deep-learning-based methods have succeeded in modeling in-vivo protein-DNA binding, but they often (1) follow the fully supervised learning framework and overlook the weakly supervised information of genomic sequences, namely that a bound DNA sequence may have multiple TFBSs, and (2) use one-hot encoding to encode DNA sequences and ignore the dependencies among nucleotides. In this paper, we propose a weakly supervised framework for modeling in-vivo protein-DNA binding, which combines multiple-instance learning with a hybrid deep neural network and uses k-mer encoding to transform DNA sequences. First, this framework segments sequences into multiple overlapping instances using a sliding window, and then encodes all instances into image-like inputs of high-order dependencies using k-mer encoding. Second, it computes a score for each instance in the same bag separately, using a hybrid deep neural network that integrates convolutional and recurrent neural networks. Finally, it integrates the predicted values of all instances into the final prediction for the bag using the Noisy-and method. The experimental results on in-vivo datasets demonstrate the superior performance of the proposed framework. In addition, we explore the performance of the proposed framework when using k-mer encoding, demonstrate the performance of the Noisy-and method by comparing it with other fusion methods, and find that adding recurrent layers can improve the performance of the proposed framework. Nature Publishing Group UK 2019-06-11 /pmc/articles/PMC6559991/ /pubmed/31186519 http://dx.doi.org/10.1038/s41598-019-44966-x Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Zhang, Qinhu Shen, Zhen Huang, De-Shuang Modeling in-vivo protein-DNA binding by combining multiple-instance learning with a hybrid deep neural network |
title | Modeling in-vivo protein-DNA binding by combining multiple-instance learning with a hybrid deep neural network |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6559991/ https://www.ncbi.nlm.nih.gov/pubmed/31186519 http://dx.doi.org/10.1038/s41598-019-44966-x |
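
The description field above outlines three steps: sliding-window segmentation into a bag of instances, k-mer encoding of each instance, and Noisy-and fusion of the instance scores. The sketch below illustrates those steps in plain Python under stated assumptions; it is not the authors' implementation. The function names, the window and stride values, and the Noisy-and parameters `a` (slope) and `b` (threshold) are hypothetical choices made for illustration, the k-mer encoding is assumed to be a one-hot vector over all 4^k k-mers at each position, and the Noisy-and formula is the commonly cited form of that pooling function. The hybrid convolutional and recurrent scoring network is not reproduced here; the instance scores in the usage lines are placeholders.

```python
import math
from itertools import product


def segment(seq, window, stride):
    """Split a sequence into overlapping instances (one 'bag') with a sliding window."""
    if len(seq) < window:
        return [seq]
    return [seq[i:i + window] for i in range(0, len(seq) - window + 1, stride)]


def kmer_encode(instance, k):
    """One-hot encode every overlapping k-mer of an instance.

    Returns one column per position; each column has length 4**k.
    Assumption: k-mers containing symbols other than A, C, G, T map to all zeros.
    """
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    columns = []
    for i in range(len(instance) - k + 1):
        col = [0] * (4 ** k)
        j = index.get(instance[i:i + k])
        if j is not None:
            col[j] = 1
        columns.append(col)
    return columns


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def noisy_and(scores, a=10.0, b=0.5):
    """Fuse instance probabilities into one bag probability with Noisy-and pooling.

    a (slope) and b (threshold) are illustrative values, not taken from the record.
    """
    mean = sum(scores) / len(scores)
    low = sigmoid(-a * b)
    return (sigmoid(a * (mean - b)) - low) / (sigmoid(a * (1.0 - b)) - low)


# Usage: build a bag, encode it, and fuse placeholder instance scores.
bag = segment("ACGTACGT", window=4, stride=2)
encoded = [kmer_encode(inst, k=2) for inst in bag]
placeholder_scores = [0.9, 0.2, 0.7]  # stand-ins for the network's per-instance outputs
bag_probability = noisy_and(placeholder_scores)
```

Noisy-and maps a mean instance score of 0 to a bag probability of 0 and a mean of 1 to 1, with a smooth transition around the threshold `b`, so a bag is called bound only when enough of its instances score highly.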