Decoding ChIP-seq with a double-binding signal refines binding peaks to single-nucleotides and predicts cooperative interaction

The comprehension of protein and DNA binding in vivo is essential to understand gene regulation. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) provides a global map of the regulatory binding network. Most ChIP-seq analysis tools focus on identifying binding regions from coverage en...

Bibliographic Details
Main authors: Gomes, Antonio L.C.; Abeel, Thomas; Peterson, Matthew; Azizi, Elham; Lyubetskaya, Anna; Carvalho, Luís; Galagan, James
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2014
Subjects: Method
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4199365/
https://www.ncbi.nlm.nih.gov/pubmed/25024162
http://dx.doi.org/10.1101/gr.161711.113
author Gomes, Antonio L.C.
Abeel, Thomas
Peterson, Matthew
Azizi, Elham
Lyubetskaya, Anna
Carvalho, Luís
Galagan, James
collection PubMed
description The comprehension of protein and DNA binding in vivo is essential to understand gene regulation. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) provides a global map of the regulatory binding network. Most ChIP-seq analysis tools focus on identifying binding regions from coverage enrichment. However, less work has been performed to infer the physical and regulatory details inside the enriched regions. This research extends a previous blind-deconvolution approach to develop a post-peak–calling algorithm that improves binding site resolution and predicts cooperative interactions. At the core of our new method is a physically motivated model that characterizes the binding signal as an extreme value distribution. This model suggests a mathematical framework to study physical properties of DNA shearing from the ChIP-seq coverage. The model explains the ChIP-seq coverage with two signals: The first considers DNA fragments with only a single binding event, whereas the second considers fragments with two binding events (a double-binding signal). The model incorporates motif discovery and is able to detect multiple sites in an enriched region with single-nucleotide resolution, high sensitivity, and high specificity. Our method improves peak caller sensitivity, from less than 45% up to 94%, at a false positive rate <11% for a set of 47 experimentally validated prokaryotic sites. It also improves resolution of highly enriched regions of large-scale eukaryotic data sets. The double-binding signal provides a novel application in ChIP-seq analysis: the identification of cooperative interaction. Predictions of known cooperative binding sites show a 0.85 area under an ROC curve.
format Online
Article
Text
id pubmed-4199365
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-4199365 2015-04-01 Genome Res Method Cold Spring Harbor Laboratory Press 2014-10 /pmc/articles/PMC4199365/ /pubmed/25024162 http://dx.doi.org/10.1101/gr.161711.113 Text en
© 2014 Gomes et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title Decoding ChIP-seq with a double-binding signal refines binding peaks to single-nucleotides and predicts cooperative interaction
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4199365/
https://www.ncbi.nlm.nih.gov/pubmed/25024162
http://dx.doi.org/10.1101/gr.161711.113