Perm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior-Enhanced Read Mapping

Segmental duplications and other highly repetitive regions of genomes contribute significantly to cells’ regulatory programs. Advancements in next generation sequencing enabled genome-wide profiling of protein-DNA interactions by chromatin immunoprecipitation followed by high throughput sequencing (...

Bibliographic Details
Main Authors: Zeng, Xin; Li, Bo; Welch, Rene; Rojo, Constanza; Zheng, Ye; Dewey, Colin N.; Keleş, Sündüz
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4618727/
https://www.ncbi.nlm.nih.gov/pubmed/26484757
http://dx.doi.org/10.1371/journal.pcbi.1004491
author Zeng, Xin
Li, Bo
Welch, Rene
Rojo, Constanza
Zheng, Ye
Dewey, Colin N.
Keleş, Sündüz
collection PubMed
description Segmental duplications and other highly repetitive regions of genomes contribute significantly to cells’ regulatory programs. Advancements in next generation sequencing enabled genome-wide profiling of protein-DNA interactions by chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq). However, interactions in highly repetitive regions of genomes have proven difficult to map since short reads of 50–100 base pairs (bps) from these regions map to multiple locations in reference genomes. Standard analytical methods discard such multi-mapping reads, and the few that can accommodate them are prone to large false positive and negative rates. We developed Perm-seq, a prior-enhanced read allocation method for ChIP-seq experiments, that can allocate multi-mapping reads in highly repetitive regions of the genomes with high accuracy. We comprehensively evaluated Perm-seq, and found that our prior-enhanced approach significantly improves multi-read allocation accuracy over approaches that do not utilize additional data types. The statistical formalism underlying our approach facilitates supervising of multi-read allocation with a variety of data sources including histone ChIP-seq. We applied Perm-seq to 64 ENCODE ChIP-seq datasets from GM12878 and K562 cells and identified many novel protein-DNA interactions in segmental duplication regions. Our analysis reveals that although the protein-DNA interaction sites are evolutionarily less conserved in repetitive regions, they share the overall sequence characteristics of the protein-DNA interactions in non-repetitive regions.
format Online
Article
Text
id pubmed-4618727
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4618727 2015-10-29 PLoS Comput Biol Research Article
Public Library of Science 2015-10-20 /pmc/articles/PMC4618727/ /pubmed/26484757 http://dx.doi.org/10.1371/journal.pcbi.1004491 Text en © 2015 Zeng et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Perm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior-Enhanced Read Mapping
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4618727/
https://www.ncbi.nlm.nih.gov/pubmed/26484757
http://dx.doi.org/10.1371/journal.pcbi.1004491