BindSpace decodes transcription factor binding signals by large-scale sequence embedding


Bibliographic Details
Main Authors: Yuan, Han; Kshirsagar, Meghana; Zamparo, Lee; Lu, Yuheng; Leslie, Christina S.
Format: Online Article Text
Language: English
Published: 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6717532/
https://www.ncbi.nlm.nih.gov/pubmed/31406384
http://dx.doi.org/10.1038/s41592-019-0511-y
Description
Summary: Decoding transcription factor (TF) binding signals in genomic DNA is a fundamental problem. Here we present a prediction model called BindSpace that learns to embed DNA sequences and TF class/family labels into the same space. By training on binding data for hundreds of TFs and embedding over 1M DNA sequences, BindSpace achieves state-of-the-art multiclass binding prediction performance, in vitro and in vivo, and can distinguish signals of closely related TFs.