Direct ChIP-Seq significance analysis improves target prediction

BACKGROUND: Chromatin immunoprecipitation followed by sequencing of protein-bound DNA fragments (ChIP-Seq) is an effective high-throughput methodology for the identification of context-specific DNA fragments that are bound by specific proteins in vivo. Despite significant progress in the bioinformat...

Bibliographic Details
Main authors: Bansal, Mukesh; Mendiratta, Geetu; Anand, Santosh; Kushwaha, Ritu; Kim, Ryan Hyunjae; Kustagi, Manju; Iyer, Archana; Chaganti, Raju SK; Califano, Andrea; Sumazin, Pavel
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4460594/
https://www.ncbi.nlm.nih.gov/pubmed/26040656
http://dx.doi.org/10.1186/1471-2164-16-S5-S4
_version_ 1782375396505812992
author Bansal, Mukesh
Mendiratta, Geetu
Anand, Santosh
Kushwaha, Ritu
Kim, Ryan Hyunjae
Kustagi, Manju
Iyer, Archana
Chaganti, Raju SK
Califano, Andrea
Sumazin, Pavel
author_sort Bansal, Mukesh
collection PubMed
description BACKGROUND: Chromatin immunoprecipitation followed by sequencing of protein-bound DNA fragments (ChIP-Seq) is an effective high-throughput methodology for identifying context-specific DNA fragments that are bound by specific proteins in vivo. Despite significant progress in the bioinformatics analysis of these genome-scale data, a number of challenges remain, because technology-dependent biases, including variable target accessibility and mappability, sequence-dependent variability, and non-specific binding affinity, must be accounted for. RESULTS AND DISCUSSION: We introduce a nonparametric method for scoring consensus regions of aligned immunoprecipitated DNA fragments when appropriate control experiments are available. Our method uses local models for null binding; these are necessary because binding prediction scores based on global models alone fail to properly account for specialized features of genomic regions and chance pull-downs of specific DNA fragments, thus disproportionately rewarding some genomic regions and decreasing prediction accuracy. We make no assumptions about the structure or amplitude of bound peaks, yet we show that our method outperforms leading methods developed using either global or local null hypothesis models for random binding. We test prediction performance by comparing analyses of ChIP-Seq, ChIP-chip, motif-based binding-site prediction, and shRNA assays, showing high reproducibility, binding-site enrichment in predicted target regions, and functional regulation of predicted targets. CONCLUSIONS: Given appropriate controls, a direct nonparametric method for identifying transcription-factor targets from ChIP-Seq assays may lead to both higher sensitivity and higher specificity, and should be preferred to, or used in conjunction with, methods that use parametric models for null binding.
format Online
Article
Text
id pubmed-4460594
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4460594 2015-06-29 Direct ChIP-Seq significance analysis improves target prediction BMC Genomics Research BioMed Central 2015-05-26 /pmc/articles/PMC4460594/ /pubmed/26040656 http://dx.doi.org/10.1186/1471-2164-16-S5-S4 Text en Copyright © 2015 Bansal et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Direct ChIP-Seq significance analysis improves target prediction
topic Research