Crunch: integrated processing and modeling of ChIP-seq data in terms of regulatory motifs

Although ChIP-seq has become a routine experimental approach for quantitatively characterizing the genome-wide binding of transcription factors (TFs), computational analysis procedures remain far from standardized, making it difficult to compare ChIP-seq results across experiments. In addition, alth...

Bibliographic Details
Main authors: Berger, Severin; Pachkov, Mikhail; Arnold, Phil; Omidi, Saeed; Kelley, Nicholas; Salatino, Silvia; van Nimwegen, Erik
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6633267/
https://www.ncbi.nlm.nih.gov/pubmed/31138617
http://dx.doi.org/10.1101/gr.239319.118
author Berger, Severin
Pachkov, Mikhail
Arnold, Phil
Omidi, Saeed
Kelley, Nicholas
Salatino, Silvia
van Nimwegen, Erik
collection PubMed
description Although ChIP-seq has become a routine experimental approach for quantitatively characterizing the genome-wide binding of transcription factors (TFs), computational analysis procedures remain far from standardized, making it difficult to compare ChIP-seq results across experiments. In addition, although genome-wide binding patterns must ultimately be determined by local constellations of DNA-binding sites, current analysis is typically limited to identifying enriched motifs in ChIP-seq peaks. Here we present Crunch, a completely automated computational method that performs all ChIP-seq analysis from quality control through read mapping and peak detecting and that integrates comprehensive modeling of the ChIP signal in terms of known and novel binding motifs, quantifying the contribution of each motif and annotating which combinations of motifs explain each binding peak. By applying Crunch to 128 data sets from the ENCODE Project, we show that Crunch outperforms current peak finders and find that TFs naturally separate into “solitary TFs,” for which a single motif explains the ChIP-peaks, and “cobinding TFs,” for which multiple motifs co-occur within peaks. Moreover, for most data sets, the motifs that Crunch identified de novo outperform known motifs, and both the set of cobinding motifs and the top motif of solitary TFs are consistent across experiments and cell lines. Crunch is implemented as a web server, enabling standardized analysis of any collection of ChIP-seq data sets by simply uploading raw sequencing data. Results are provided both in a graphical web interface and as downloadable files.
format Online
Article
Text
id pubmed-6633267
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-6633267 2019-07-30 Crunch: integrated processing and modeling of ChIP-seq data in terms of regulatory motifs Berger, Severin Pachkov, Mikhail Arnold, Phil Omidi, Saeed Kelley, Nicholas Salatino, Silvia van Nimwegen, Erik Genome Res Method
Cold Spring Harbor Laboratory Press 2019-07 /pmc/articles/PMC6633267/ /pubmed/31138617 http://dx.doi.org/10.1101/gr.239319.118 Text en © 2019 Berger et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title Crunch: integrated processing and modeling of ChIP-seq data in terms of regulatory motifs
topic Method