HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency

MOTIVATION: Transcription factors (TFs) bind to conserved DNA binding sites in different cellular environments and developmental stages through physical interaction with interdependent nucleotides. However, systematic computational characterization of the relationship between higher-order nucleotide depe...

Bibliographic Details
Main authors: Wang, Zixuan; Xiong, Shuwen; Yu, Yun; Zhou, Jiliu; Zhang, Yongqing
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10191609/
https://www.ncbi.nlm.nih.gov/pubmed/37140548
http://dx.doi.org/10.1093/bioinformatics/btad299
author Wang, Zixuan
Xiong, Shuwen
Yu, Yun
Zhou, Jiliu
Zhang, Yongqing
collection PubMed
description MOTIVATION: Transcription factors (TFs) bind to conserved DNA binding sites in different cellular environments and developmental stages through physical interaction with interdependent nucleotides. However, systematic computational characterization of the relationship between higher-order nucleotide dependency and the TF-DNA binding mechanism in diverse cell types remains challenging. RESULTS: Here, we propose HAMPLE, a novel multi-task learning framework that simultaneously predicts TF binding sites (TFBS) in distinct cell types by characterizing higher-order nucleotide dependencies. Specifically, HAMPLE first represents a DNA sequence through three higher-order nucleotide dependencies: k-mer encoding, DNA shape, and histone modification. Then, HAMPLE uses a customized gate control and a channel attention convolutional architecture to capture cell-type-specific and cell-type-shared DNA binding motifs and epigenomic languages. Finally, HAMPLE uses a joint loss function to optimize TFBS prediction for the different cell types in an end-to-end manner. Extensive experimental results on seven datasets demonstrate that HAMPLE significantly outperforms state-of-the-art approaches in terms of auROC. In addition, feature importance analysis shows that k-mer encoding, DNA shape, and histone modification each have predictive power for TF-DNA binding in different cellular environments and are complementary to each other. Furthermore, an ablation study and an interpretability analysis validate the effectiveness of the customized gate control and the channel attention convolutional architecture in characterizing higher-order nucleotide dependencies. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/ZhangLab312/Hample.
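The description names k-mer encoding as one of the three sequence representations. As a rough illustration of what such an encoding can look like, the Python sketch below slides a window of width k over a DNA sequence and maps each k-mer to an integer index. It is not taken from the HAMPLE source code: the function names `kmer_vocabulary` and `kmer_encode`, the default k = 3, and the use of -1 for windows containing ambiguous bases are all assumptions made for this example.

```python
from itertools import product
from typing import Dict, List


def kmer_vocabulary(k: int) -> Dict[str, int]:
    """Map every k-mer over A/C/G/T to an index, in lexicographic order."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}


def kmer_encode(sequence: str, k: int = 3) -> List[int]:
    """Return the index of each overlapping k-mer in the sequence.

    Hypothetical convention: a window that contains any character other
    than A, C, G, or T (for example N) is encoded as -1.
    """
    vocab = kmer_vocabulary(k)
    seq = sequence.upper()
    return [vocab.get(seq[i:i + k], -1) for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    # With k = 2 the windows of "ACGT" are AC, CG, and GT.
    print(kmer_encode("ACGT", k=2))  # prints [1, 6, 11]
```

Each k-mer depends on k adjacent positions at once, which is one simple way to expose dependencies between neighbouring nucleotides to a downstream model. How HAMPLE itself turns k-mers into model inputs is not specified in this record.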
format Online
Article
Text
id pubmed-10191609
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10191609 2023-05-18 HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency Wang, Zixuan; Xiong, Shuwen; Yu, Yun; Zhou, Jiliu; Zhang, Yongqing Bioinformatics Original Paper Oxford University Press 2023-05-04 /pmc/articles/PMC10191609/ /pubmed/37140548 http://dx.doi.org/10.1093/bioinformatics/btad299 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10191609/
https://www.ncbi.nlm.nih.gov/pubmed/37140548
http://dx.doi.org/10.1093/bioinformatics/btad299