BindSpace decodes transcription factor binding signals by large-scale sequence embedding

Decoding transcription factor (TF) binding signals in genomic DNA is a fundamental problem. Here we present a prediction model called BindSpace that learns to embed DNA sequences and TF class/family labels into the same space. By training on binding data for hundreds of TFs and embedding over 1M DNA...

Bibliographic Details
Main authors: Yuan, Han; Kshirsagar, Meghana; Zamparo, Lee; Lu, Yuheng; Leslie, Christina S.
Format: Online Article Text
Language: English
Published: 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6717532/
https://www.ncbi.nlm.nih.gov/pubmed/31406384
http://dx.doi.org/10.1038/s41592-019-0511-y
_version_ 1783447575386390528
author Yuan, Han
Kshirsagar, Meghana
Zamparo, Lee
Lu, Yuheng
Leslie, Christina S.
collection PubMed
description Decoding transcription factor (TF) binding signals in genomic DNA is a fundamental problem. Here we present a prediction model called BindSpace that learns to embed DNA sequences and TF class/family labels into the same space. By training on binding data for hundreds of TFs and embedding over 1M DNA sequences, BindSpace achieves state-of-the-art multiclass binding prediction performance, in vitro and in vivo, and can distinguish signals of closely related TFs.
format Online
Article
Text
id pubmed-6717532
institution National Center for Biotechnology Information
language English
publishDate 2019
record_format MEDLINE/PubMed
spelling pubmed-6717532 2020-02-12 BindSpace decodes transcription factor binding signals by large-scale sequence embedding. Yuan, Han; Kshirsagar, Meghana; Zamparo, Lee; Lu, Yuheng; Leslie, Christina S. Nat Methods. Article. Decoding transcription factor (TF) binding signals in genomic DNA is a fundamental problem. Here we present a prediction model called BindSpace that learns to embed DNA sequences and TF class/family labels into the same space. By training on binding data for hundreds of TFs and embedding over 1M DNA sequences, BindSpace achieves state-of-the-art multiclass binding prediction performance, in vitro and in vivo, and can distinguish signals of closely related TFs. 2019-08-12 2019-09 /pmc/articles/PMC6717532/ /pubmed/31406384 http://dx.doi.org/10.1038/s41592-019-0511-y Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title BindSpace decodes transcription factor binding signals by large-scale sequence embedding
topic Article