PolyaPeak: Detecting Transcription Factor Binding Sites from ChIP-seq Using Peak Shape Information

ChIP-seq is a powerful technology for detecting genomic regions where a protein of interest interacts with DNA. ChIP-seq data for mapping transcription factor binding sites (TFBSs) have a characteristic pattern: around each binding site, sequence reads aligned to the forward and reverse strands of t...

Bibliographic Details
Main authors: Wu, Hao; Ji, Hongkai
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3946423/
https://www.ncbi.nlm.nih.gov/pubmed/24608116
http://dx.doi.org/10.1371/journal.pone.0089694
_version_ 1782306646352986112
author Wu, Hao
Ji, Hongkai
author_facet Wu, Hao
Ji, Hongkai
author_sort Wu, Hao
collection PubMed
description ChIP-seq is a powerful technology for detecting genomic regions where a protein of interest interacts with DNA. ChIP-seq data for mapping transcription factor binding sites (TFBSs) have a characteristic pattern: around each binding site, sequence reads aligned to the forward and reverse strands of the reference genome form two separate peaks shifted away from each other, and the true binding site is located between these two peaks. While it has been shown previously that the accuracy and resolution of binding site detection can be improved by modeling this pattern, efficient methods that fully utilize this information in the TFBS detection procedure have been unavailable. We present PolyaPeak, a new method to improve TFBS detection by incorporating the peak shape information. PolyaPeak describes peak shapes using a flexible Pólya model. The shapes are automatically learned from the data using a Minorization-Maximization (MM) algorithm, then integrated with the read count information via a hierarchical model to distinguish true binding sites from background noise. Extensive real data analyses show that PolyaPeak is capable of robustly improving TFBS detection compared with existing methods. An R package is freely available.
format Online
Article
Text
id pubmed-3946423
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3946423 2014-03-10 PolyaPeak: Detecting Transcription Factor Binding Sites from ChIP-seq Using Peak Shape Information Wu, Hao Ji, Hongkai PLoS One Research Article Public Library of Science 2014-03-07 /pmc/articles/PMC3946423/ /pubmed/24608116 http://dx.doi.org/10.1371/journal.pone.0089694 Text en © 2014 Wu, Ji http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Wu, Hao
Ji, Hongkai
PolyaPeak: Detecting Transcription Factor Binding Sites from ChIP-seq Using Peak Shape Information
title PolyaPeak: Detecting Transcription Factor Binding Sites from ChIP-seq Using Peak Shape Information
title_full PolyaPeak: Detecting Transcription Factor Binding Sites from ChIP-seq Using Peak Shape Information
title_fullStr PolyaPeak: Detecting Transcription Factor Binding Sites from ChIP-seq Using Peak Shape Information
title_full_unstemmed PolyaPeak: Detecting Transcription Factor Binding Sites from ChIP-seq Using Peak Shape Information
title_short PolyaPeak: Detecting Transcription Factor Binding Sites from ChIP-seq Using Peak Shape Information
title_sort polyapeak: detecting transcription factor binding sites from chip-seq using peak shape information
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3946423/
https://www.ncbi.nlm.nih.gov/pubmed/24608116
http://dx.doi.org/10.1371/journal.pone.0089694
work_keys_str_mv AT wuhao polyapeakdetectingtranscriptionfactorbindingsitesfromchipsequsingpeakshapeinformation
AT jihongkai polyapeakdetectingtranscriptionfactorbindingsitesfromchipsequsingpeakshapeinformation