Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The sta...
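Main authors: | Chung, Dongjun; Kuan, Pei Fen; Li, Bo; Sanalkumar, Rajendran; Liang, Kun; Bresnick, Emery H.; Dewey, Colin; Keleş, Sündüz |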
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3136429/ https://www.ncbi.nlm.nih.gov/pubmed/21779159 http://dx.doi.org/10.1371/journal.pcbi.1002111 |
_version_ | 1782208201836462080 |
---|---|
author | Chung, Dongjun Kuan, Pei Fen Li, Bo Sanalkumar, Rajendran Liang, Kun Bresnick, Emery H. Dewey, Colin Keleş, Sündüz |
author_facet | Chung, Dongjun Kuan, Pei Fen Li, Bo Sanalkumar, Rajendran Liang, Kun Bresnick, Emery H. Dewey, Colin Keleş, Sündüz |
author_sort | Chung, Dongjun |
collection | PubMed |
description | Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments. |
format | Online Article Text |
id | pubmed-3136429 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3136429 2011-07-21 Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data Chung, Dongjun Kuan, Pei Fen Li, Bo Sanalkumar, Rajendran Liang, Kun Bresnick, Emery H. Dewey, Colin Keleş, Sündüz PLoS Comput Biol Research Article Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments. 
Public Library of Science 2011-07-14 /pmc/articles/PMC3136429/ /pubmed/21779159 http://dx.doi.org/10.1371/journal.pcbi.1002111 Text en Chung et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Chung, Dongjun Kuan, Pei Fen Li, Bo Sanalkumar, Rajendran Liang, Kun Bresnick, Emery H. Dewey, Colin Keleş, Sündüz Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data |
title | Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data |
title_full | Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data |
title_fullStr | Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data |
title_full_unstemmed | Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data |
title_short | Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data |
title_sort | discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of chip-seq data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3136429/ https://www.ncbi.nlm.nih.gov/pubmed/21779159 http://dx.doi.org/10.1371/journal.pcbi.1002111 |
work_keys_str_mv | AT chungdongjun discoveringtranscriptionfactorbindingsitesinhighlyrepetitiveregionsofgenomeswithmultireadanalysisofchipseqdata AT kuanpeifen discoveringtranscriptionfactorbindingsitesinhighlyrepetitiveregionsofgenomeswithmultireadanalysisofchipseqdata AT libo discoveringtranscriptionfactorbindingsitesinhighlyrepetitiveregionsofgenomeswithmultireadanalysisofchipseqdata AT sanalkumarrajendran discoveringtranscriptionfactorbindingsitesinhighlyrepetitiveregionsofgenomeswithmultireadanalysisofchipseqdata AT liangkun discoveringtranscriptionfactorbindingsitesinhighlyrepetitiveregionsofgenomeswithmultireadanalysisofchipseqdata AT bresnickemeryh discoveringtranscriptionfactorbindingsitesinhighlyrepetitiveregionsofgenomeswithmultireadanalysisofchipseqdata AT deweycolin discoveringtranscriptionfactorbindingsitesinhighlyrepetitiveregionsofgenomeswithmultireadanalysisofchipseqdata AT kelessunduz discoveringtranscriptionfactorbindingsitesinhighlyrepetitiveregionsofgenomeswithmultireadanalysisofchipseqdata |
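
The description field above mentions allocating multi-reads as fractional counts using a weighted alignment scheme. The record does not give the scheme itself, so the following is only a minimal illustrative sketch and not the authors' method. It assumes each multi-read is split across its candidate bins in proportion to the read support those bins already have. The function name `allocate_multi_reads`, the bin labels, the `pseudocount` smoothing term, and the fixed number of iterations are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def allocate_multi_reads(uni_counts, multi_reads, pseudocount=1.0, iterations=10):
    """Split each multi-read across its candidate bins as fractional counts.

    uni_counts:  dict mapping bin id -> number of uniquely mapped reads (uni-reads)
    multi_reads: list of lists; each inner list holds the candidate bins of one read
    Returns a dict mapping bin id -> uni-read count plus fractional multi-read count.
    Illustrative assumption only: weights are proportional to the support a bin
    receives from all other reads, smoothed by a pseudocount.
    """
    # Drop duplicate candidate bins per read and skip reads with no candidates.
    candidates = [list(dict.fromkeys(bins)) for bins in multi_reads if bins]

    # Start from uniform weights: a read with n candidate bins gives 1/n to each.
    weights = [{b: 1.0 / len(bins) for b in bins} for bins in candidates]

    def accumulate(current_weights):
        # Add every read's fractional contributions on top of the uni-read counts.
        totals = defaultdict(float, uni_counts)
        for w in current_weights:
            for b, frac in w.items():
                totals[b] += frac
        return totals

    for _ in range(iterations):
        totals = accumulate(weights)
        updated = []
        for bins, w in zip(candidates, weights):
            # Support of a bin excludes this read's own current contribution.
            support = {b: totals[b] - w[b] + pseudocount for b in bins}
            norm = sum(support.values())
            updated.append({b: s / norm for b, s in support.items()})
        weights = updated

    return dict(accumulate(weights))


if __name__ == "__main__":
    # Hypothetical toy input: bin "A" has nine uni-reads, bin "B" has none,
    # and one multi-read aligns equally well to both bins.
    print(allocate_multi_reads({"A": 9, "B": 0}, [["A", "B"]]))
```

Because each read's weights are normalized to sum to one, the total count equals the number of uni-reads plus the number of multi-reads, so sequencing depth increases without any read being counted more than once.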