
Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data

Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The sta...


Bibliographic Details
Main Authors:	Chung, Dongjun; Kuan, Pei Fen; Li, Bo; Sanalkumar, Rajendran; Liang, Kun; Bresnick, Emery H.; Dewey, Colin; Keleş, Sündüz
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2011
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3136429/ https://www.ncbi.nlm.nih.gov/pubmed/21779159 http://dx.doi.org/10.1371/journal.pcbi.1002111

author	Chung, Dongjun; Kuan, Pei Fen; Li, Bo; Sanalkumar, Rajendran; Liang, Kun; Bresnick, Emery H.; Dewey, Colin; Keleş, Sündüz
author_sort	Chung, Dongjun
collection	PubMed
description	Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.
format	Online Article Text
id	pubmed-3136429
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-3136429 2011-07-21 Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data Chung, Dongjun; Kuan, Pei Fen; Li, Bo; Sanalkumar, Rajendran; Liang, Kun; Bresnick, Emery H.; Dewey, Colin; Keleş, Sündüz PLoS Comput Biol Research Article Public Library of Science 2011-07-14 /pmc/articles/PMC3136429/ /pubmed/21779159 http://dx.doi.org/10.1371/journal.pcbi.1002111 Text en Chung et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3136429/ https://www.ncbi.nlm.nih.gov/pubmed/21779159 http://dx.doi.org/10.1371/journal.pcbi.1002111

Similar Items