Locating transcription factor binding sites by fully convolutional neural network

Transcription factors (TFs) play an important role in regulating gene expression, thus identification of the regions bound by them has become a fundamental step for molecular and cellular biology. In recent years, an increasing number of deep learning (DL) based methods have been proposed for predic...

Bibliographic Details
Main Authors: Zhang, Qinhu; Wang, Siguo; Chen, Zhanheng; He, Ying; Liu, Qi; Huang, De-Shuang
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Problem Solving Protocol
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425303/
https://www.ncbi.nlm.nih.gov/pubmed/33498086
http://dx.doi.org/10.1093/bib/bbaa435
collection PubMed
description Transcription factors (TFs) play an important role in regulating gene expression, thus identification of the regions bound by them has become a fundamental step for molecular and cellular biology. In recent years, an increasing number of deep learning (DL) based methods have been proposed for predicting TF binding sites (TFBSs) and achieved impressive prediction performance. However, these methods mainly focus on predicting the sequence specificity of TF-DNA binding, which is equivalent to a sequence-level binary classification task, and fail to identify motifs and TFBSs accurately. In this paper, we developed a fully convolutional network coupled with global average pooling (FCNA), which by contrast is equivalent to a nucleotide-level binary classification task, to roughly locate TFBSs and accurately identify motifs. Experimental results on human ChIP-seq datasets show that FCNA outperforms other competing methods significantly. Besides, we find that the regions located by FCNA can be used by motif discovery tools to further refine the prediction performance. Furthermore, we observe that FCNA can accurately identify TF-DNA binding motifs across different cell lines and infer indirect TF-DNA bindings.
id pubmed-8425303
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8425303 2021-09-09 Brief Bioinform Problem Solving Protocol Oxford University Press 2021-01-26 © The Author(s) 2021. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
topic Problem Solving Protocol