
DREME: motif discovery in transcription factor ChIP-seq data

Motivation: Transcription factor (TF) ChIP-seq datasets have particular characteristics that provide unique challenges and opportunities for motif discovery. Most existing motif discovery algorithms do not scale well to such large datasets, or fail to report many motifs associated with cofactors of...


Bibliographic Details
Main author: Bailey, Timothy L.
Format: Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3106199/
https://www.ncbi.nlm.nih.gov/pubmed/21543442
http://dx.doi.org/10.1093/bioinformatics/btr261
author Bailey, Timothy L.
collection PubMed
description Motivation: Transcription factor (TF) ChIP-seq datasets have particular characteristics that provide unique challenges and opportunities for motif discovery. Most existing motif discovery algorithms do not scale well to such large datasets, or fail to report many motifs associated with cofactors of the ChIP-ed TF. Results: We present DREME, a motif discovery algorithm specifically designed to find the short, core DNA-binding motifs of eukaryotic TFs, and optimized to analyze very large ChIP-seq datasets in minutes. Using DREME, we discover the binding motifs of the ChIP-ed TF and many cofactors in mouse ES cell (mESC), mouse erythrocyte and human cell line ChIP-seq datasets. For example, in mESC ChIP-seq data for the TF Esrrb, we discover the binding motifs for eight cofactor TFs important in the maintenance of pluripotency. Several other commonly used algorithms find at most two cofactor motifs in this same dataset. DREME can also perform discriminative motif discovery, and we use this feature to provide evidence that Sox2 and Oct4 do not bind in mES cells as an obligate heterodimer. DREME is much faster than many commonly used algorithms, scales linearly in dataset size, finds multiple, non-redundant motifs and reports a reliable measure of statistical significance for each motif found. DREME is available as part of the MEME Suite of motif-based sequence analysis tools (http://meme.nbcr.net). Contact: t.bailey@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
format Text
id pubmed-3106199
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3106199 2011-06-03 DREME: motif discovery in transcription factor ChIP-seq data Bailey, Timothy L. Bioinformatics Original Papers Oxford University Press 2011-06-15 2011-05-04 /pmc/articles/PMC3106199/ /pubmed/21543442 http://dx.doi.org/10.1093/bioinformatics/btr261 Text en © The Author(s) 2011. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title DREME: motif discovery in transcription factor ChIP-seq data
topic Original Papers