HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency
MOTIVATION: Transcription factors (TFs) bind to conserved DNA binding sites in different cellular environments and developmental stages through physical interaction with interdependent nucleotides. However, systematic computational characterization of the relationship between higher-order nucleotide depe...
Main authors: | Wang, Zixuan; Xiong, Shuwen; Yu, Yun; Zhou, Jiliu; Zhang, Yongqing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Original Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10191609/ https://www.ncbi.nlm.nih.gov/pubmed/37140548 http://dx.doi.org/10.1093/bioinformatics/btad299 |
_version_ | 1785043499438047232 |
---|---|
author | Wang, Zixuan Xiong, Shuwen Yu, Yun Zhou, Jiliu Zhang, Yongqing |
author_sort | Wang, Zixuan |
collection | PubMed |
description | MOTIVATION: Transcription factors (TFs) bind to conserved DNA binding sites in different cellular environments and developmental stages through physical interaction with interdependent nucleotides. However, systematic computational characterization of the relationship between higher-order nucleotide dependency and the TF-DNA binding mechanism in diverse cell types remains challenging. RESULTS: Here, we propose HAMPLE, a novel multi-task learning framework that simultaneously predicts TF binding sites (TFBS) in distinct cell types by characterizing higher-order nucleotide dependencies. Specifically, HAMPLE first represents a DNA sequence through three higher-order nucleotide dependencies: k-mer encoding, DNA shape, and histone modification. Then, HAMPLE uses a customized gate control and a channel attention convolutional architecture to capture cell-type-specific and cell-type-shared DNA binding motifs and epigenomic languages. Finally, HAMPLE uses a joint loss function to optimize TFBS prediction for the different cell types in an end-to-end manner. Extensive experimental results on seven datasets demonstrate that HAMPLE significantly outperforms state-of-the-art approaches in terms of auROC. In addition, feature importance analysis shows that k-mer encoding, DNA shape, and histone modification have predictive power for TF-DNA binding in different cellular environments and are complementary to each other. Furthermore, an ablation study and an interpretability analysis validate the effectiveness of the customized gate control and the channel attention convolutional architecture in characterizing higher-order nucleotide dependencies. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/ZhangLab312/Hample. |
format | Online Article Text |
id | pubmed-10191609 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10191609; 2023-05-18; Bioinformatics; Original Paper; Oxford University Press; 2023-05-04; /pmc/articles/PMC10191609/; /pubmed/37140548; http://dx.doi.org/10.1093/bioinformatics/btad299; Text; en; © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10191609/ https://www.ncbi.nlm.nih.gov/pubmed/37140548 http://dx.doi.org/10.1093/bioinformatics/btad299 |
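
The description above names k-mer encoding as one of the three sequence representations. As a rough illustration only, the sketch below shows one common way to turn a DNA sequence into overlapping k-mer indices. It is not taken from the HAMPLE source code: the function names `kmer_index` and `kmer_encode`, the default `k=3`, and the use of `-1` for k-mers containing ambiguous bases are all assumptions made for this example.

```python
from itertools import product


def kmer_index(k, alphabet="ACGT"):
    """Map every possible k-mer over the alphabet to an integer ID.

    IDs follow lexicographic order (AA..A = 0). Hypothetical helper,
    not part of the HAMPLE code base.
    """
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}


def kmer_encode(seq, k=3):
    """Encode a DNA sequence as the IDs of its overlapping k-mers.

    A sequence of length n yields n - k + 1 IDs. K-mers that contain
    a character outside A/C/G/T (for example N) are assigned -1; this
    convention is an assumption of the sketch.
    """
    index = kmer_index(k)
    seq = seq.upper()
    return [index.get(seq[i:i + k], -1) for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    # Overlapping 2-mers of ACGT are AC, CG, GT.
    print(kmer_encode("ACGT", k=2))
```

A representation like this captures dependencies between adjacent nucleotides that a one-base-at-a-time encoding cannot; how HAMPLE actually embeds k-mers is documented in the linked repository rather than in this record.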