A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data

Understanding gene regulation is a key challenge in today's biology. The new technologies of protein-binding microarrays (PBMs) and high-throughput SELEX (HT-SELEX) allow measurement of the binding intensities of one transcription factor (TF) to numerous synthetic double-stranded DNA sequences...

Bibliographic Details
Main Authors: Orenstein, Yaron; Shamir, Ron
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4005680/
https://www.ncbi.nlm.nih.gov/pubmed/24500199
http://dx.doi.org/10.1093/nar/gku117
author Orenstein, Yaron
Shamir, Ron
collection PubMed
description Understanding gene regulation is a key challenge in today's biology. The new technologies of protein-binding microarrays (PBMs) and high-throughput SELEX (HT-SELEX) allow measurement of the binding intensities of one transcription factor (TF) to numerous synthetic double-stranded DNA sequences in a single experiment. Recently, Jolma et al. reported the results of 547 HT-SELEX experiments covering human and mouse TFs. Because 162 of these TFs were also covered by PBM technology, for the first time, a large-scale comparison between implementations of these two in vitro technologies is possible. Here we assessed the similarities and differences between binding models, represented as position weight matrices, inferred from PBM and HT-SELEX, and also measured how well these models predict in vivo binding. Our results show that HT-SELEX- and PBM-derived models agree for most TFs. For some TFs, the HT-SELEX-derived models are longer versions of the PBM-derived models, whereas for other TFs, the HT-SELEX models match the secondary PBM-derived models. Remarkably, PBM-based 8-mer ranking is more accurate than that of HT-SELEX, but models derived from HT-SELEX predict in vivo binding better. In addition, we reveal several biases in HT-SELEX data including nucleotide frequency bias, enrichment of C-rich k-mers and oligos and underrepresentation of palindromes.
format Online
Article
Text
id pubmed-4005680
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4005680 2014-05-01 A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data Orenstein, Yaron Shamir, Ron Nucleic Acids Res Methods Online Oxford University Press 2014-04 2014-02-05 /pmc/articles/PMC4005680/ /pubmed/24500199 http://dx.doi.org/10.1093/nar/gku117 Text en © The Author(s) 2014. Published by Oxford University Press.
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data
topic Methods Online