RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets

ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restr...

Bibliographic Details
Main Authors: Thomas-Chollier, Morgane, Herrmann, Carl, Defrance, Matthieu, Sand, Olivier, Thieffry, Denis, van Helden, Jacques
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287167/
https://www.ncbi.nlm.nih.gov/pubmed/22156162
http://dx.doi.org/10.1093/nar/gkr1104
_version_ 1782224626531696640
author Thomas-Chollier, Morgane
Herrmann, Carl
Defrance, Matthieu
Sand, Olivier
Thieffry, Denis
van Helden, Jacques
author_facet Thomas-Chollier, Morgane
Herrmann, Carl
Defrance, Matthieu
Sand, Olivier
Thieffry, Denis
van Helden, Jacques
author_sort Thomas-Chollier, Morgane
collection PubMed
description ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128 000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks.
format Online
Article
Text
id pubmed-3287167
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-32871672012-02-27 RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets Thomas-Chollier, Morgane Herrmann, Carl Defrance, Matthieu Sand, Olivier Thieffry, Denis van Helden, Jacques Nucleic Acids Res Methods Online ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128 000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks. Oxford University Press 2012-02 2011-12-08 /pmc/articles/PMC3287167/ /pubmed/22156162 http://dx.doi.org/10.1093/nar/gkr1104 Text en © The Author(s) 2011. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods Online
Thomas-Chollier, Morgane
Herrmann, Carl
Defrance, Matthieu
Sand, Olivier
Thieffry, Denis
van Helden, Jacques
RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets
title RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets
title_full RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets
title_fullStr RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets
title_full_unstemmed RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets
title_short RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets
title_sort rsat peak-motifs: motif analysis in full-size chip-seq datasets
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287167/
https://www.ncbi.nlm.nih.gov/pubmed/22156162
http://dx.doi.org/10.1093/nar/gkr1104
work_keys_str_mv AT thomascholliermorgane rsatpeakmotifsmotifanalysisinfullsizechipseqdatasets
AT herrmanncarl rsatpeakmotifsmotifanalysisinfullsizechipseqdatasets
AT defrancematthieu rsatpeakmotifsmotifanalysisinfullsizechipseqdatasets
AT sandolivier rsatpeakmotifsmotifanalysisinfullsizechipseqdatasets
AT thieffrydenis rsatpeakmotifsmotifanalysisinfullsizechipseqdatasets
AT vanheldenjacques rsatpeakmotifsmotifanalysisinfullsizechipseqdatasets