
Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks

BACKGROUND: High throughput signature sequencing holds many promises, one of which is the ready identification of in vivo transcription factor binding sites, histone modifications, changes in chromatin structure and patterns of DNA methylation across entire genomes. In these experiments, chromatin i...


Bibliographic Details
Main Authors:	Nix, David A; Courdy, Samir J; Boucher, Kenneth M
Format:	Text
Language:	English
Published:	BioMed Central 2008
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2628906/ https://www.ncbi.nlm.nih.gov/pubmed/19061503 http://dx.doi.org/10.1186/1471-2105-9-523

author	Nix, David A Courdy, Samir J Boucher, Kenneth M
collection	PubMed
description	BACKGROUND: High-throughput signature sequencing holds many promises, one of which is the ready identification of in vivo transcription factor binding sites, histone modifications, changes in chromatin structure and patterns of DNA methylation across entire genomes. In these experiments, chromatin immunoprecipitation is used to enrich for particular DNA sequences of interest and signature sequencing is used to map the regions to the genome (ChIP-Seq). Elucidation of these sites of DNA-protein binding/modification is proving instrumental in reconstructing networks of gene regulation and chromatin remodelling that direct development, response to cellular perturbation, and neoplastic transformation. RESULTS: Here we present a package of algorithms and software that makes use of control input data to reduce false positives and estimate confidence in ChIP-Seq peaks. Several different methods were compared using two simulated spike-in datasets. Use of control input data and a normalized difference score were found to more than double the recovery of ChIP-Seq peaks at a 5% false discovery rate (FDR). Moreover, both a binomial p-value/q-value and an empirical FDR were found to predict the true FDR within 2–3-fold and are more reliable estimators of confidence than a global Poisson p-value. These methods were then used to reanalyze Johnson et al.'s neuron-restrictive silencer factor (NRSF) ChIP-Seq data without relying on extensive qPCR-validated NRSF sites and the presence of NRSF binding motifs for setting thresholds. CONCLUSION: The methods developed and tested here show considerable promise for reducing false positives and estimating confidence in ChIP-Seq data without any prior knowledge of the ChIP target. They are part of a larger open source package freely available from http://useq.sourceforge.net/.
format	Text
id	pubmed-2628906
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2628906 2009-01-22 Nix, David A; Courdy, Samir J; Boucher, Kenneth M. BMC Bioinformatics. Methodology Article. BioMed Central 2008-12-05 /pmc/articles/PMC2628906/ /pubmed/19061503 http://dx.doi.org/10.1186/1471-2105-9-523 Text en Copyright ©2008 Nix et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2628906/ https://www.ncbi.nlm.nih.gov/pubmed/19061503 http://dx.doi.org/10.1186/1471-2105-9-523
