Cargando…

ChIP-PaM: an algorithm to identify protein-DNA interaction using ChIP-Seq data

BACKGROUND: ChIP-Seq is a powerful tool for identifying the interaction between genomic regulators and their bound DNAs, especially for locating transcription factor binding sites. However, high cost and high rate of false discovery of transcription factor binding sites identified from ChIP-Seq data...

Descripción completa

Detalles Bibliográficos
Autores principales: Wu, Song, Wang, Jianmin, Zhao, Wei, Pounds, Stanley, Cheng, Cheng
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2010
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2893127/
https://www.ncbi.nlm.nih.gov/pubmed/20525272
http://dx.doi.org/10.1186/1742-4682-7-18
Descripción
Sumario:BACKGROUND: ChIP-Seq is a powerful tool for identifying the interaction between genomic regulators and their bound DNAs, especially for locating transcription factor binding sites. However, high cost and high rate of false discovery of transcription factor binding sites identified from ChIP-Seq data significantly limit its application. RESULTS: Here we report a new algorithm, ChIP-PaM, for identifying transcription factor target regions in ChIP-Seq datasets. This algorithm makes full use of a protein-DNA binding pattern by capitalizing on three lines of evidence: 1) the tag count modelling at the peak position, 2) pattern matching of a specific tag count distribution, and 3) motif searching along the genome. A novel data-based two-step eFDR procedure is proposed to integrate the three lines of evidence to determine significantly enriched regions. Our algorithm requires no technical controls and efficiently discriminates falsely enriched regions from regions enriched by true transcription factor (TF) binding on the basis of ChIP-Seq data only. An analysis of real genomic data is presented to demonstrate our method. CONCLUSIONS: In a comparison with other existing methods, we found that our algorithm provides more accurate binding site discovery while maintaining comparable statistical power.