Systematic evaluation of the impact of ChIP-seq read designs on genome coverage, peak identification, and allele-specific binding detection

BACKGROUND: Chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiments revolutionized genome-wide profiling of transcription factors and histone modifications. Although maturing sequencing technologies allow these experiments to be carried out with short (36–50 bps), long (75–100 bp...

Bibliographic Details
Main Authors: Zhang, Qi; Zeng, Xin; Younkin, Sam; Kawli, Trupti; Snyder, Michael P.; Keleş, Sündüz
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765064/
https://www.ncbi.nlm.nih.gov/pubmed/26908256
http://dx.doi.org/10.1186/s12859-016-0957-1
author Zhang, Qi
Zeng, Xin
Younkin, Sam
Kawli, Trupti
Snyder, Michael P.
Keleş, Sündüz
author_sort Zhang, Qi
collection PubMed
description BACKGROUND: Chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiments revolutionized genome-wide profiling of transcription factors and histone modifications. Although maturing sequencing technologies allow these experiments to be carried out with short (36–50 bps), long (75–100 bps), single-end, or paired-end reads, the impact of these read parameters on the downstream data analysis is not well understood. In this paper, we evaluate the effects of different read parameters on genome sequence alignment, coverage of different classes of genomic features, peak identification, and allele-specific binding detection. RESULTS: We generated 101 bps paired-end ChIP-seq data for many transcription factors from human GM12878 and MCF7 cell lines. Systematic evaluations using in silico variations of these data as well as fully simulated data revealed complex interplay between the sequencing parameters and analysis tools, and indicated clear advantages of paired-end designs in several aspects such as alignment accuracy, peak resolution, and most notably, allele-specific binding detection. CONCLUSIONS: Our work elucidates the effect of design on the downstream analysis and provides insights to investigators in deciding sequencing parameters in ChIP-seq experiments. We present the first systematic evaluation of the impact of ChIP-seq designs on allele-specific binding detection and highlight the power of paired-end designs in such studies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0957-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4765064
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4765064 2016-02-25 Systematic evaluation of the impact of ChIP-seq read designs on genome coverage, peak identification, and allele-specific binding detection. Zhang, Qi; Zeng, Xin; Younkin, Sam; Kawli, Trupti; Snyder, Michael P.; Keleş, Sündüz. BMC Bioinformatics. Research Article.
BioMed Central 2016-02-24 /pmc/articles/PMC4765064/ /pubmed/26908256 http://dx.doi.org/10.1186/s12859-016-0957-1 Text en © Zhang et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Systematic evaluation of the impact of ChIP-seq read designs on genome coverage, peak identification, and allele-specific binding detection
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765064/
https://www.ncbi.nlm.nih.gov/pubmed/26908256
http://dx.doi.org/10.1186/s12859-016-0957-1