Non-targeted transcription factors motifs are a systemic component of ChIP-seq datasets

BACKGROUND: The global effort to annotate the non-coding portion of the human genome relies heavily on chromatin immunoprecipitation data generated with high-throughput DNA sequencing (ChIP-seq). ChIP-seq is generally successful in detailing the segments of the genome bound by the immunoprecipitated...

Bibliographic Details
Main Authors: Worsley Hunt, Rebecca; Wasserman, Wyeth W
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4165360/
https://www.ncbi.nlm.nih.gov/pubmed/25070602
http://dx.doi.org/10.1186/s13059-014-0412-4
author Worsley Hunt, Rebecca
Wasserman, Wyeth W
author_facet Worsley Hunt, Rebecca
Wasserman, Wyeth W
author_sort Worsley Hunt, Rebecca
collection PubMed
description BACKGROUND: The global effort to annotate the non-coding portion of the human genome relies heavily on chromatin immunoprecipitation data generated with high-throughput DNA sequencing (ChIP-seq). ChIP-seq is generally successful in detailing the segments of the genome bound by the immunoprecipitated transcription factor (TF); however, almost all datasets contain genomic regions devoid of the canonical motif for the TF. It remains to be determined whether these regions are related to the immunoprecipitated TF or whether, despite the use of controls, a portion of the peaks can be attributed to other causes. RESULTS: Analyses across hundreds of ChIP-seq datasets generated for sequence-specific DNA-binding TFs reveal a small set of TF binding profiles for which predicted TF binding site motifs are repeatedly observed to be significantly enriched. Grouping related binding profiles, the set includes CTCF-like, ETS-like, JUN-like, and THAP11 profiles. These frequently enriched profiles are termed ‘zingers’ to highlight their unanticipated enrichment in datasets for which they were not the targeted TF, and their potential impact on the interpretation and analysis of TF ChIP-seq data. Peaks that contain zinger motifs but lack the ChIPped TF’s motif are observed to compose up to 45% of a ChIP-seq dataset. There is substantial overlap of zinger-motif-containing regions between diverse TF datasets, suggesting a mechanism that is not TF-specific for the recovery of these regions. CONCLUSIONS: Based on the zinger regions’ proximity to cohesin-bound segments, a loading station model is proposed. Further study of zingers will advance understanding of gene regulation. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0412-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4165360
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4165360 2014-09-17 Non-targeted transcription factors motifs are a systemic component of ChIP-seq datasets Worsley Hunt, Rebecca Wasserman, Wyeth W Genome Biol Research BACKGROUND: The global effort to annotate the non-coding portion of the human genome relies heavily on chromatin immunoprecipitation data generated with high-throughput DNA sequencing (ChIP-seq). ChIP-seq is generally successful in detailing the segments of the genome bound by the immunoprecipitated transcription factor (TF); however, almost all datasets contain genomic regions devoid of the canonical motif for the TF. It remains to be determined whether these regions are related to the immunoprecipitated TF or whether, despite the use of controls, a portion of the peaks can be attributed to other causes. RESULTS: Analyses across hundreds of ChIP-seq datasets generated for sequence-specific DNA-binding TFs reveal a small set of TF binding profiles for which predicted TF binding site motifs are repeatedly observed to be significantly enriched. Grouping related binding profiles, the set includes CTCF-like, ETS-like, JUN-like, and THAP11 profiles. These frequently enriched profiles are termed ‘zingers’ to highlight their unanticipated enrichment in datasets for which they were not the targeted TF, and their potential impact on the interpretation and analysis of TF ChIP-seq data. Peaks that contain zinger motifs but lack the ChIPped TF’s motif are observed to compose up to 45% of a ChIP-seq dataset. There is substantial overlap of zinger-motif-containing regions between diverse TF datasets, suggesting a mechanism that is not TF-specific for the recovery of these regions. CONCLUSIONS: Based on the zinger regions’ proximity to cohesin-bound segments, a loading station model is proposed. Further study of zingers will advance understanding of gene regulation.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0412-4) contains supplementary material, which is available to authorized users. BioMed Central 2014-07-29 2014 /pmc/articles/PMC4165360/ /pubmed/25070602 http://dx.doi.org/10.1186/s13059-014-0412-4 Text en © Worsley Hunt and Wasserman; licensee BioMed Central 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Worsley Hunt, Rebecca
Wasserman, Wyeth W
Non-targeted transcription factors motifs are a systemic component of ChIP-seq datasets
title Non-targeted transcription factors motifs are a systemic component of ChIP-seq datasets
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4165360/
https://www.ncbi.nlm.nih.gov/pubmed/25070602
http://dx.doi.org/10.1186/s13059-014-0412-4